Mass1 gene, a target for anticonvulsant drug development

ABSTRACT

The present invention relates to a novel gene which is associated with audiogenic seizures in mice. The gene is known as the Monogenic Audiogenic Seizure-susceptible gene or mass1. The product of the mass1 gene is designated MASS1. Nucleic acid molecules that encode MASS1 have been identified and purified. The sequence of murine mass1 can be found at SEQ ID NO: 1, and the sequence of human mass1 can be found at SEQ ID NO: 3. Mammalian genes encoding a MASS1 protein are also provided. The invention also provides recombinant vectors comprising nucleic acid molecules that code for a MASS1 protein. These vectors can be plasmids. In certain embodiments, the vectors are prokaryotic or eukaryotic expression vectors. The nucleic acid coding for MASS1 can be linked to a heterologous promoter. The invention also relates to transgenic animals in which one or both alleles of the endogenous mass1 gene are mutated.

RELATED APPLICATIONS

This application is related to and claims the benefit of U.S. Provisional Application Ser. No. 60/187,209 of Louis J. Ptacek, H. Steve White and Ying-Hui Fu, filed Mar. 3, 2000 and entitled “Novel Epilepsy Gene is a Target for Anticonvulsant Drug Development,” and U.S. Provisional Application Ser. No. 60/222,898 of Louis J. Ptacek, H. Steve White, Ying-Hui Fu, and Shana Skradski, filed Aug. 3, 2000 and entitled “Human mass1 Gene,” which are incorporated herein by this reference.

FIELD OF THE INVENTION

The present invention relates to the isolation and characterization of a novel gene relating to epilepsy. More specifically, the invention relates to the isolation and characterization of the Monogenic Audiogenic Seizure-susceptible gene, hereinafter mass1 gene.

TECHNICAL BACKGROUND

Epilepsy is a common neurological disorder that affects nearly 2.5 million people in the United States. Epilepsy is characterized by recurrent seizures resulting from a sudden burst of electrical energy in the brain. The electrical discharge of brain cells causes a change in a person's consciousness, movement, and/or sensations. The intensity and frequency of the epileptic seizures vary from person to person.

Epilepsies in humans can be separated into two forms, symptomatic and non-symptomatic. Symptomatic epilepsy is a seizure disorder related to a known cause such as metabolic disease, brain malformations, or brain tumors. In these cases, seizures presumably occur because of a very abnormal focus (or foci) in the brain. Genetic models of symptomatic epilepsy include the weaver mouse (wv), in which a mutation of the G protein-gated inwardly rectifying potassium channel GIRK2 results in neuro-developmental abnormalities and seizures. Signorini, S. et al. (1997), Proc Natl Acad Sci USA 94: 923-7. Fragile X-associated protein knock-out mice have a neurodevelopmental syndrome with lowered thresholds to audiogenic seizures. Musumeci, S. A. et al. (2000), Epilepsia 41: 19-23. Audiogenic seizures can also be induced in seizure-resistant mice such as C57BL/6 by repetitive sound stimulation, suggesting that seizure-susceptibility can be influenced by multiple genetic and environmental factors. Henry, K. R. (1967), Science 158: 938-40.

Non-symptomatic epilepsies are defined when no structural or metabolic lesions are recognized and the patients have no other neurological findings between seizures. This latter group of patients is more likely to have primary neuronal hyperexcitability that is not caused by metabolic, developmental or structural lesions. Molecular characterization of electrical hyperexcitability in human muscle diseases led to the hypothesis that such disorders might be the result of mutations in neuronal ion channels, the primary determinants of neuronal membrane excitability. Ptacek, L. J. et al. (1991), Cell 67: 1021-7.

All non-symptomatic human epilepsy syndromes and genetic mouse seizure models that have been characterized at a molecular level are caused by mutations in ion channels. Ptacek, L. J. (1999), Semin Neurol 19: 363-9; Jen, J. & L. J. Ptacek (2000), Channelopathies: Episodic Disorders of the Nervous System. Metabolic and Molecular Bases of Inherited Disease. C. R. Schriver, A. L. Beaudet, W. S. Sly and D. Valle. New York, McGraw-Hill. pp. 5223-5238; Noebels, J. L. (2000), The Inherited Epilepsies. Metabolic and Molecular Bases of Inherited Disease. C. R. Schriver, A. L. Beaudet, W. S. Sly and D. Valle. New York, McGraw-Hill. pp. 5807-5832. Some patients with febrile seizures have been recognized to have mutations in sodium channel α and β1 subunits, while some patients with epilepsy and episodic ataxia were shown to have calcium channel β-subunit mutations. Wallace, R. H. et al. (1998), Nat Genet 19: 366-70; Escayg, A. et al. (2000), Am J Hum Genet 66: 1531-9; Escayg, A. et al. (2000), Nat Genet 24: 343-5. The voltage-gated potassium channel genes KCNQ2 and KCNQ3, when mutated, result in benign familial neonatal convulsions. Biervert, C. et al. (1998), Science 279: 403-6; Charlier, C. et al. (1998), Nat Genet 18: 53-5; Singh, N. A. et al. (1998), Nat Genet 18: 25-9. Defects in ligand-gated channels can also result in epilepsy, as demonstrated by mutations in the α4 subunit of the neuronal nicotinic acetylcholine receptor that result in autosomal dominant nocturnal frontal lobe epilepsy. Steinlein, O. K. et al. (1995), Nat Genet 11: 201-3. In mice, the α, β and γ subunits of the voltage-sensitive calcium channel have been associated with the tottering (tg), lethargic (lh) and stargazer (stg) models of absence seizures. Fletcher, C. F. et al. (1996), Cell 87: 607-17; Burgess, D. L. et al. (1997), Cell 88: 385-92; Letts, V. A. et al. (1998), Nat Genet 19: 340-7. Finally, audiogenic seizure-susceptibility has been characterized in a mouse knockout model of the 5-HT_(2C) receptor; homozygous mice have audiogenic seizures and altered feeding behavior. Tecott, L. H. et al. (1995), Nature 374: 542-6; Brennan, T. J. et al. (1997), Nat Genet 16: 387-90.

The Frings mouse represents one of many strains of mice and rats that are sensitive to audiogenic seizures (AGS). These AGS-susceptible rodents represent models of generalized reflex epilepsy and include the well-studied DBA/2 mouse and GEPR-9 rat. The Frings mouse seizure phenotype is similar to other described audiogenic seizures and is characterized by wild running, loss of righting reflex, tonic flexion and tonic extension in response to high intensity sound stimulation. Schreiber, R. A. et al. (1980), Genet 10: 537-43. This strain was characterized 50 years ago when it arose as a spontaneous mutation on the Swiss Albino background. Frings, H. et al. (1951), J Mammal 32: 60-76. Selective inbreeding for seizure-susceptibility produced the current homozygous Frings strain with >99% penetrance of audiogenic seizures. The Frings mouse seizure phenotype was due to the autosomal recessive transmission of a single gene.

Audiogenic seizures have been observed in polygenic rodent models, such as the DBA/2 mouse and GEPR-9 rat. Collins, R. L. (1970), Behav Genet 1: 99-109; Seyfried, T. N. et al. (1980), Genetics 94: 701-718; Seyfried, T. N. & G. H. Glaser (1981), Genetics 99: 117-126; Neumann, P. E. & T. N. Seyfried (1990), Behav Genet 20: 307-23; Neumann, P. E. & R. L. Collins (1991), Proc Natl Acad Sci USA 88: 5408-12; Ribak, C. E. et al. (1988), Epilepsy Res 2: 345-55. While no genes associated with audiogenic seizures in spontaneous mutant models have been cloned, three putative loci associated with seizure-susceptibility in the DBA/2 mouse (asp1, asp2, and asp3) have been mapped to chromosomes 12, 4, and 7, respectively. Neumann & Seyfried, supra; Neumann, P. E. & R. L. Collins, supra. As a monogenic audiogenic seizure model, the Frings mice provided a unique opportunity for cloning and characterization of an audiogenic seizure gene. The Frings mice are an important naturally occurring monogenic model of a discrete non-symptomatic epilepsy and provide significant information on a novel mechanism of seizure-susceptibility as well as central nervous system excitability in general.

In light of the foregoing, it will be appreciated that it would be an advancement in the art to identify and characterize nucleic acid sequences that are associated with the monogenic AGS susceptibility in Frings mice. It would be a further advancement to identify and characterize the human orthologue of this gene. It would be a further advancement if the nucleic acid sequences could provide additional understanding of how epileptic seizures are triggered in disease. It would be a further advancement to provide a transgenic animal model wherein the endogenous gene associated with the Frings phenotype is mutated.

Such nucleic acid sequences and animals are disclosed and claimed herein.

BRIEF SUMMARY OF THE INVENTION

The present invention relates to an isolated novel gene, known as the mass1 gene, which has been implicated in audiogenic seizure-susceptibility in mice. Provided herein are nucleic acid molecules that encode the MASS1 protein. The nucleic acid molecules of the present invention may also comprise the nucleotide sequence for human mass1 (SEQ ID NO: 3) and murine mass1 (SEQ ID NO: 1). In certain other embodiments, the present invention provides nucleic acid molecules that code for the amino acid sequence of human MASS1 (SEQ ID NO: 4) and murine MASS1 (SEQ ID NO: 2). The invention also provides nucleic acid molecules complementary to the nucleic acid molecules of SEQ ID NO: 3 and SEQ ID NO: 1. The invention also relates to other mammalian mass1 genes and MASS1 proteins.

The present invention also relates to an isolated nucleic acid having at least 15 consecutive nucleotides as represented by a nucleotide sequence selected from the nucleotides of the murine mass1 gene (SEQ ID NO: 1) and the nucleotides of the human mass1 gene (SEQ ID NO: 3). A nucleic acid having from about 15 to about 30 consecutive nucleotides as represented by a nucleotide sequence selected from the nucleotides of the murine mass1 gene (SEQ ID NO: 1) and the nucleotides of the human mass1 gene (SEQ ID NO: 3) is also within the scope of the present invention.

The present invention also provides recombinant vectors comprising nucleic acid molecules that code for MASS1. These recombinant vectors may be plasmids. In other embodiments, these recombinant vectors are prokaryotic or eukaryotic expression vectors. The nucleic acid coding for MASS1 may also be operably linked to a heterologous promoter. The present invention further provides host cells comprising a nucleic acid that codes for MASS1.

The present invention also relates to a transgenic mammal with a mutation in one or both alleles of the endogenous mass1 gene. The mutation in one or both alleles of the endogenous mass1 gene may result in a mammal with a seizure-susceptible phenotype. The transgenic mammal of the present invention may be a mouse. The mutation may result from the insertion of a selectable marker gene sequence or other heterologous sequence into the mammal's genome by homologous recombination. The invention also provides cells derived from the transgenic mammal.

These and other advantages of the present invention will become apparent upon reading the following detailed description and appended claims.

SUMMARY OF THE DRAWINGS

A more particular description of the invention briefly described above will be rendered by reference to the appended drawings and graphs. These drawings and graphs only provide information concerning typical embodiments of the invention and are not therefore to be considered limiting of its scope.

FIG. 1 shows a linkage map of the mass1 locus initially defined by markers D13Mit126 and D13Mit200. Markers D13Mit69, 97, and 312 (enclosed in rectangles) were used to genotype the F2 progeny. The estimated genetic distances are shown. The locations of candidate genes Nhe3, Dat1, and Adcy2 are indicated. The map inset represents the large-scale physical map of the mass1 interval spanned by yeast artificial chromosomes (YACs). SLC10 and SLC11 are novel SSLP markers, and the others are STS markers.

FIG. 2 is a fine-scale physical map of the mass1 interval defined by bacterial artificial chromosomes (BACs) and cosmids. SLC-numbers between 10 and 100 are novel SSLP markers, and SLC-numbers 100 to 200 are novel STS markers. The bars above the map represent the genotypes of the nearest recombinant mice. The gray bars represent regions where the mice are recombinant, black filled bars are regions where the mice are nonrecombinant, and white filled bars are regions where the markers were not informative. The final mass1 interval was spanned by cosmids C13A and C1B, and the complete genomic sequence was generated between the markers SLC20 and SLC14. The alignment of the mass1 exons that were identified from the sequence is shown at the bottom.

FIG. 3 is a diagram of the mass1 genomic structure showing three putative transcripts and exons that are included in each transcript. The short transcript, mass1.3, has putative 5′ untranslated sequence leading into exon 22. Exon 7a and 7b represent two alternate exons that have been identified in mouse brain cDNA. The medium transcript, mass1.2, has putative 5′ untranslated sequence leading into exon 7b, and the longest transcript, mass1.1, has only been shown to contain exon 7a. A long and short splice variant was identified in exon 27 (27L and 27S). The 27S variant removes 83 base pairs and changes the reading frame.

FIG. 4A illustrates expression analysis of the mass1 gene by RT-PCR in different tissue and cell RNA samples using primers from exons 23 and 24. Analysis of mass1 in multiple tissue RNA samples of a CF1 mouse shows expression is primarily in the brain, kidney, and lung, and not in the other tissues listed.

FIG. 4B illustrates further expression analysis of the mass1 gene by RT-PCR using brain RNA. Mass1 expression was detected in all regions of the brain tested.

FIG. 4C illustrates expression analysis by RT-PCR of the mass1 gene with pooled cultured cortical neuron RNA and cultured astrocyte RNA compared to whole brain. The mass1-specific primers span intron 23, and the expected product size is 487 base pairs. The β-actin primers also span two exons, and the expected product size is 327 base pairs. The ladder is in 100 base pair increments.

FIG. 5A is a sequence chromatogram of the exon 27 segment from C57BL/6J and Frings DNA. The sequence chromatogram illustrates the identification of a single base pair deletion found in exon 27 of mass1 sequence of Frings mice. The Frings mouse DNA contains a single G deletion at nucleotide 7009.

FIG. 5B illustrates high resolution gel electrophoresis of PCR products from a 150 base pair segment of exon 27 encompassing 7009ΔG, showing that none of the seizure-resistant and seizure-susceptible control mouse DNA samples harbor the deletion present in the Frings mouse.

FIG. 6 illustrates the conceptual amino acid translation of the mass1.1 transcript (SEQ ID NO: 5). The 18 MASS1 repetitive motifs are boxed with a solid line and the 2 less conserved possible repeats are boxed with a dashed line. The putative multicopper oxidase I domain is underlined. The valine→stop mutation in the Frings MASS1 protein is located at amino acid number 1072 marked with the “*”.

FIG. 7 illustrates the amino acid sequence alignment of the MASS1 repeats (SEQ ID NOS: 6-23). The first 18 lines represent the well conserved amino acid repeat motif found in MASS1. Positions of highly conserved amino acids are shaded gray. The next line shows the consensus sequence for the MASS1 repeat (SEQ ID NO: 24), and below it are the sequences of the Na⁺/Ca²⁺ exchanger (β1 and β2) segments that share homology with the MASS1 repeat (SEQ ID NOS: 25 & 26). Also shown is a homologous region of the very large G-protein coupled receptor-1 (Accession 55586) (SEQ ID NO: 27). The boxed segment outlines the DDD motif that has been shown to be a Ca²⁺ binding site in the Na⁺/Ca²⁺ exchanger β1 segment.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to DNA for a novel Monogenic Audiogenic Seizure-susceptible gene (mass1). More particularly, the present invention relates to the isolation and characterization of the mouse mass1 gene (SEQ ID NO: 1) and the human mass1 gene (SEQ ID NO: 3). The discovery that the murine mass1 gene is mutated in Frings mice suggests that mass1 has a role in seizure susceptibility.

Nucleotide sequences complementary to the nucleotide sequences of SEQ ID NO: 1 and SEQ ID NO: 3 are also provided. Isolated and purified nucleotide sequences that code for the amino acid sequence of the mouse MASS1 (SEQ ID NO: 2) protein are also within the scope of the invention. Nucleotide sequences that code for the amino acid sequence of the human MASS1 (SEQ ID NO: 4) protein are within the scope of the invention. A nucleic acid sequence that codes for MASS1 of any mammal is also within the scope of the invention.

The nucleic acid molecules that code for mammalian MASS1 proteins, such as a human or murine MASS1, can be contained within recombinant vectors such as plasmids, recombinant phages or viruses, transposons, cosmids, or artificial chromosomes. Such vectors can also include elements that control the replication and expression of the mass1 nucleic acid sequences. The vectors can also have sequences that allow for the screening or selection of cells containing the vector. Such screening or selection sequences can include antibiotic resistance genes. The recombinant vectors can be prokaryotic expression vectors or eukaryotic expression vectors. The nucleic acid coding for MASS1 can be linked to a heterologous promoter.

Host cells comprising a nucleic acid that codes for mammalian MASS1 are also provided. The host cells can be prepared by transfecting an appropriate nucleic acid into a cell using transfection techniques that are known in the art. These techniques include calcium phosphate co-precipitation, microinjection, electroporation, liposome-mediated gene transfer, and high velocity microprojectiles.

The Frings mouse is unique among rodent epilepsy models. It is a naturally-occurring single gene model of audiogenic generalized seizures without any other associated neurological or behavioral phenotypes. Sequencing of cosmids from the nonrecombinant mass1 interval identified a single gene. Until recently, computer-based BLAST nucleotide sequence similarity searches did not identify significant similarity between the mass1 sequence and any other sequences in the databases. The deficiency of mass1 cDNA sequence in the databases further supports the hypothesis that mass1 is expressed in low abundance in the brain or that it is degraded very rapidly. This hypothesis is based on the fact that screening two independent brain cDNA libraries for the mass1 cDNA did not produce any positive clones, and low message levels were further supported by Northern blots, RT-PCR, and in situ hybridization. The low abundance could be due to low expression of the mass1 mRNA, or to the message being unstable and quickly degraded.

The mass1 gene was identified by positional cloning and sequencing, exon prediction, RT-PCR and PCR-based 5′ and 3′ RACE. Screening several cDNA libraries by hybridization had not identified a mass1 cDNA clone. Despite not finding a cDNA clone in the cDNA libraries, convincing data implicate mass1 as the gene causing AGS in the Frings mice. Mass1 is the only gene found in the small non-recombinant mass1 interval. The cDNA from both mouse and human Marathon cDNA libraries (Clontech, Palo Alto, Calif.) can be amplified. The intron-exon boundaries are conserved for the genomic structure of hMass1. The alternate transcript of mouse mass1 exon 27 is also found in hMass1. The mass1 transcripts contain long open reading frames which are disrupted by a single base-pair deletion in the Frings mouse.

PCR approaches have been required to clone all or parts of other genes such as the melatonin receptor. Reppert, S. M. et al. (1994), Neuron 13: 1177-85. In such cases, results must be viewed with caution because of artifacts inherent with PCR-based assays. Problems include producing inaccurate sequence due to Taq DNA polymerase errors and errors due to amplifying parts of homologous genes. To avoid these problems, the mass1 final sequence was compiled from segments amplified with a high fidelity Pfx DNA polymerase (Gibco) to produce accurate sequence from multiple templates. The mass1 cDNA sequence matched exactly with predicted exons from genomic sequencing of cosmids C1B, C13A, and C20B (FIG. 2).

The homology of the MASS1 protein sequence repetitive motifs to the sodium-calcium exchanger (Na⁺/Ca²⁺ exchanger) β1 and β2 repeat domains may provide an important clue toward identifying the function of this novel protein. Although the identity between these proteins is limited to a short segment of the cytosolic loop of the exchanger, it is likely to be functionally significant in MASS1 because this motif is repeated 18 times within the protein sequence (FIGS. 6 and 7). The Na⁺/Ca²⁺ exchanger is a plasma membrane associated protein that co-transports three sodium ions into a cell and one calcium ion out of the cell using the sodium electrochemical gradient. Nicoll et al., supra. The Na⁺/Ca²⁺ exchanger can be regulated by intracellular calcium at a Ca²⁺ binding site on the third cytosolic loop that is distinct from the Ca²⁺ transport site. This binding site is composed of three aspartate residues (DDD) (FIG. 7). When Ca²⁺ is bound at this site, the transporter is activated. Matsuoka, S. et al. (1993), Proc Natl Acad Sci USA 90: 3870-4; Levitsky, D. O. et al. (1994), J Biol Chem 269: 22847-52; Matsuoka, S. et al. (1995), J Gen Physiol 105: 403-20. One of the MASS1 repeats contains the DDD motif, and three others have conservative D to E substitutions, suggesting that these domains may be involved in Ca²⁺ binding.
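A scan for the DDD motif and its conservative D to E variants can be sketched as follows. This is a minimal illustration and not the alignment method used to produce FIG. 7: the function name and the toy peptide are hypothetical, and a triplet match by itself says nothing about whether a site actually binds Ca²⁺.

```python
import re

# Lookahead pattern so that overlapping acidic triplets are all reported.
ACIDIC_TRIPLET = re.compile(r"(?=([DE]{3}))")


def find_acidic_triplets(protein: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, triplet, label) for every D/E triplet.

    The label is "DDD" for the exact motif and "D/E variant" for triplets
    that contain one or more glutamate residues in place of aspartate.
    """
    hits = []
    for match in ACIDIC_TRIPLET.finditer(protein.upper()):
        triplet = match.group(1)
        label = "DDD" if triplet == "DDD" else "D/E variant"
        hits.append((match.start() + 1, triplet, label))
    return hits


# Toy peptide (hypothetical, not taken from SEQ ID NO: 2 or 4).
toy_peptide = "MKDDDLAGDEDPQEEE"
for position, triplet, label in find_acidic_triplets(toy_peptide):
    print(position, triplet, label)
```

A real analysis would restrict the scan to the aligned repeat segments so that acidic stretches elsewhere in the protein are not counted as candidate sites.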

The multicopper oxidase I consensus sequence identified within the MASS1 amino acid sequence is also an interesting putative functional domain. The multicopper oxidases represent a family of proteins that oxidize substrates while reducing molecular O₂ to H₂O. The oxidation of multiple substrate molecules occurs serially while storing electrons in the copper atom (presumably to prevent the formation of reactive species) until a molecule of O₂ is reduced. Two known multicopper oxidases, Fet3p in yeast and ceruloplasmin in humans, have been shown to oxidize and transport iron. Askwith, C. et al. (1994), Cell 76: 403-10; Harris, Z. L. et al. (1995), Proc Natl Acad Sci USA 92: 2539-43. A third multicopper oxidase, hephaestin, has been suggested to be a ferroxidase. Vulpe, C. D. et al. (1999), Nat Genet 21: 195-9. Other known multicopper oxidase substrates include Mn²⁺, serotonin, epinephrine, dopamine, and (+)-lysergic acid diethylamide (LSD). Zaitsev, V. N. et al. (1999), J Biol Inorg Chem 4: 579-87; Brouwers, G. J. et al. (1999), Appl Environ Microbiol 65: 1762-8. Therefore, loss of this putative functional domain could possibly result in problems with the metabolism of iron or other metals, copper sequestration, neurotransmitter processing, and/or oxidative stress. Furthermore, the tyrosine kinase and cAMP/cGMP dependent phosphorylation sites may be functionally significant. However, with a large protein such as MASS1, similarities and identities to functional domains commonly occur by chance, and detailed biochemical analysis of the protein will be required to determine which of these motifs are functional domains.

The human orthologue of the mass1 gene resides on chromosome 5q. Interestingly, a gene causing a human epilepsy has also been mapped to this region of chromosome 5. This locus, FEB4, was mapped in families with a phenotype of febrile convulsions. Nakayama, J. et al. (2000), Hum Mol Genet 9: 87-91. While this temperature-sensitive phenotype is different from audiogenic seizures, hmass1 will be an important candidate to test in the FEB4-linked families.

To date, all genes that have been shown to cause non-symptomatic epilepsies have encoded ion channels (voltage- or ligand-gated and exchangers). Jen & Ptacek, supra; Noebels, supra. The mass1 gene therefore represents the first novel gene shown to cause a non-symptomatic epilepsy. The seizures in the Frings mice are different from those recognized to be caused by ion channels. The phenotype is a reflex epilepsy with seizures in response to loud auditory stimuli. This suggests that the genesis of episodes may be in the brainstem rather than being due to hyperexcitability of cortical neurons. There is a growing appreciation of the role that deep brain structures and the brainstem play in the integration and modulation of cortical discharges. For example, normal synchronized discharges are seen in EEGs of sleeping individuals. Perhaps some of the reflex epilepsies in humans are not the result of primary cortical hyperexcitability, but rather of abnormal function of circuits critical for integration and modulation of cortical activity. Much work will be required to test this hypothesis, but some fascinating episodic CNS disorders have clinical and electrical manifestations that may be consistent with this idea. Fouad, G. T. et al. (1996), Am. J. Hum. Genet. 59: 135-139; Ptacek, L. J. (1998), Genetics of Focal Epilepsies. P. Genton. London, John Libbey. pp 203-13; Plaster, N. M. et al. (1999), Neurology 53: 1180-3; Swoboda, K. J. et al. (2000), Neurology 55: 224-30.

Identification and characterization of the mass1 gene reveals it to be a novel and rare transcript. Further research to determine the function of MASS1 will lead to understanding of how a defect in this protein results in seizures in these audiogenic seizure-susceptible mice. From the mouse mass1 cDNA, a partial human mass1 homolog has been identified. Through mapping and characterization of the human homolog, it may be possible to find an association of mass1 with a human epilepsy disorder. Together, the studies of the mouse and human MASS1 will provide insight into the function of this novel protein and are likely to lead to new insights into normal neuronal excitability and dysfunction of membrane excitability that can lead to seizures and epilepsy.

The present invention also provides transgenic mice in which one or both alleles of the endogenous mass1 gene are mutated. Such animals are useful, for example, to further study the physiological effects of this gene or to test potential drug candidates.

Methods for making such transgenic animals are known in the art. See, e.g., Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual (2d ed. 1994); Hasty et al. (1991), Nature 350:243-246; Mansour et al. (1988), Nature 336:348-352. Briefly, a vector containing the desired mutation is introduced into mouse embryonic stem (ES) cells. In some of these stem cells, the desired mutation may be introduced into the cell's genome by homologous recombination. Stem cells carrying the desired mutation may be identified using selection and/or screening procedures. Such cells are then injected into a blastocyst, which may develop into a chimeric mouse with some of the mouse's cells carrying the desired mutation. A chimeric animal carrying germ cells with the desired mutation may be bred to produce mutant offspring.

Vectors containing a desired mutation may be produced using methods known in the art. See, e.g., 1-3 Sambrook et al., Molecular Cloning: A Laboratory Manual (2d ed. 1989). Such vectors would typically include a portion of the mouse mass1 gene to facilitate homologous recombination between the vector and endogenous gene sequences. A selectable marker may be used to disrupt the coding sequence or an expression control element of the mass1 gene. Suitable selectable markers are known in the art. For example, the neomycin resistance gene (neo), which encodes aminoglycoside phosphotransferase (APH), allows selection in mammalian cells by conferring resistance to G418 (available from Sigma, St. Louis, Mo.). Other suitable markers may also be used to disrupt the mass1 gene. Techniques have also been developed to introduce more subtle mutations into genes. See, e.g., Hasty et al., supra.

Vectors may also include sequences to facilitate selection or screening of ES cells in which the desired mutation has been introduced by homologous recombination. For example, a vector may include one or more copies of a gene such as the herpes simplex virus thymidine kinase gene (HSV-tk) upstream and/or downstream of the mass1 gene sequences. As illustrated in Mansour et al., supra, random integration events would lead to incorporation of the HSV-tk gene into the ES cell genome, while homologous recombination events do not. ES cells carrying randomly integrated vectors (and, therefore, HSV-tk) may be selected against by growing the cells in a medium supplemented with ganciclovir.
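The logic of combining the neo marker with flanking HSV-tk can be sketched as a small truth table in Python. The class and function names are hypothetical, and the model is deliberately idealized: it assumes that every random integration carries HSV-tk into the genome and that every homologous recombination event leaves it out, which real experiments only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ESClone:
    """Idealized ES cell clone after introduction of the targeting vector."""

    name: str
    has_neo: bool     # neo confers resistance to G418
    has_hsv_tk: bool  # HSV-tk confers sensitivity to ganciclovir


def survives(clone: ESClone, g418: bool = True, ganciclovir: bool = True) -> bool:
    """Return True if the clone grows under the given selection conditions."""
    if g418 and not clone.has_neo:
        return False  # no marker integrated: killed by G418
    if ganciclovir and clone.has_hsv_tk:
        return False  # HSV-tk present: selected against
    return True


clones = [
    ESClone("no integration", has_neo=False, has_hsv_tk=False),
    ESClone("random integration", has_neo=True, has_hsv_tk=True),
    ESClone("homologous recombination", has_neo=True, has_hsv_tk=False),
]

for clone in clones:
    print(f"{clone.name}: survives double selection = {survives(clone)}")
```

Under these assumptions only the homologous recombinant survives both drugs, which is why surviving colonies are enriched for the desired event but are still confirmed by PCR or Southern blotting.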

A vector containing the desired mutation may be introduced into ES cells in any of a number of ways. For example, electroporation may be used. See Mansour et al., supra. Other techniques for introducing vectors into cells are known in the art, including viral infection, calcium phosphate co-precipitation, direct micro-injection into cultured cells, liposome-mediated gene transfer, lipid-mediated transfection, and nucleic acid delivery using high-velocity microprojectiles. Graham et al. (1973), Virol. 52:456-467; M. R. Capecchi (1980), Cell 22:479-488; Mannino et al. (1988), BioTechniques 6:682-690; Felgner et al. (1987), Proc. Natl. Acad. Sci. USA 84:7413-7417; Klein et al. (1987), Nature 327:70-73.

Techniques for preparing, manipulating, and culturing ES cells have been described. See, e.g., Hogan et al., supra; Mansour et al., supra. ES cells carrying the desired mutation may be identified by screening or selection methods that are known in the art, including growth in selective media and screening using PCR-based or DNA hybridization (Southern blotting) techniques.

In order to better describe the details of the present invention, the following discussion is divided into six sections: (1) fine mapping and physical mapping of mass1; (2) candidate gene identification; (3) cloning and analysis of mass1 cDNA; (4) mapping of the hMass1 gene; (5) identification of a mass1 mutation in DNA from Frings mice; and (6) analysis of the mass1 translated protein sequence.

6.1 Fine Mapping & Physical Mapping

Referring to FIG. 1, the mass1 interval between D13Mit200 and D13Mit126 was estimated to be 3.6 cM with the initial set of 257 N2 mice tested. Skradski, S. L. et al. (1998), Genomics 49: 188-92. Approximately 1200 additional (Frings X C57BL/6J)F1 intercross mice were genotyped with microsatellite markers D13Mit312, D13Mit97, and D13Mit69 that span the interval. Analysis of the recombinations determined that the mass1 region was distal to the D13Mit97 marker and proximal to D13Mit69. Two additional microsatellite markers, D13Mit9 and D13Mit190, were identified within this interval from the Chromosome 13 Committee map. Genotyping of the border-defining recombinant mice with these markers narrowed the interval to between D13Mit9 and D13Mit190. Of the 1200 F2 mice, three were recombinant at D13Mit9 and ten mice were recombinant at D13Mit190. No other known simple sequence length polymorphism (SSLP) markers were mapped within this interval.
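As a rough guide to what these recombinant counts imply, the following Python sketch converts them into approximate genetic distances. It is a back-of-the-envelope estimate that is not part of the original analysis. It assumes that each F2 animal contributes two informative meioses, that each recombinant mouse carries a single recombinant chromosome, and that for such small intervals the map distance in cM is close to 100 times the recombination fraction. The function names are hypothetical.

```python
def recombination_fraction(
    recombinant_mice: int, total_mice: int, meioses_per_mouse: int = 2
) -> float:
    """Recombinant chromosomes divided by informative meioses scored.

    Assumes one recombinant chromosome per recombinant mouse.
    """
    if total_mice <= 0 or recombinant_mice < 0:
        raise ValueError("counts must be non-negative and total_mice positive")
    return recombinant_mice / (total_mice * meioses_per_mouse)


def approx_centimorgans(fraction: float) -> float:
    """Map distance in cM, using distance ~ 100 * fraction for small fractions."""
    return 100.0 * fraction


TOTAL_F2_MICE = 1200  # approximate number of intercross progeny genotyped
flanking_recombinants = {"D13Mit9": 3, "D13Mit190": 10}

interval_cm = 0.0
for marker, count in flanking_recombinants.items():
    distance = approx_centimorgans(recombination_fraction(count, TOTAL_F2_MICE))
    interval_cm += distance
    print(f"mass1 to {marker}: about {distance:.3f} cM")

print(f"D13Mit9 to D13Mit190: about {interval_cm:.3f} cM")
```

Because the number of mice is itself approximate, these figures are only indicative; they show how a large cross reduces an interval of several cM to well under 1 cM.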

The distance between the markers D13Mit9 and D13Mit190 was covered by three overlapping YACs, 151C12, 87F11, and 187D1, found on the contig WC13.27. These YACs contained four known sequence-tagged sites (STSs), SLC106, SLC117, SLC111 and SLC105, shown in FIG. 2. The four STSs were used to identify BACs from the BAC library. New single nucleotide polymorphisms were screened for by sequencing small-insert pUC19 subclone libraries of the BACs. Two newly identified polymorphic markers, SLC10 and SLC11, further narrowed the distal border and defined the mass1 interval to the distance spanned by a single YAC, 151C12, between markers SLC11 and D13Mit9 as shown in FIG. 1.

Since no known SSLPs or STSs were contained within the mass1 interval, a physical map of the region was constructed by using end sequences of BAC clones to develop new STSs to re-screen the library for overlapping BACs. Simultaneous with the physical mapping, identification of SSLPs from the new BACs continued to narrow the interval. Seven overlapping BACs were required to cover the distance between SLC11 and D13Mit9. SSLPs from each end of the insert of BAC 290J21, SLC14 and SLC15, were recombinant and localized the mass1 gene to this small region as shown in FIG. 2. Based on the insert size of the BAC, this narrowed the mass1 region to less than 150 Kb.

This BAC insert was subcloned into both a cosmid vector and pUC19. Sequences from randomly selected pUC19 clones were used to develop new STSs across the BAC, and these new markers were then used to align cosmids into a complete contiguous map of BAC 290J21 as shown in FIG. 2. SSLP screening of the pUC19 library detected five new repeat markers within BAC 290J21 (SLC16-20). Two of these, SLC19 and SLC20, were mapped within the mass1 interval. Analysis of recombinants at these markers showed a recombination with SLC20 that refined the interval to two overlapping cosmids, C1B and C13A, between the markers SLC14 and SLC20, each with a single recombinant mouse (5a9 and 2 d11).

6.2 Candidate Gene Identification

Intragenic STS markers were developed for known candidate genes (Dat1, Adcy2, and Nhe3) that mapped to the general region containing mass1. PCR analysis of the STSs showed that none of the YACs, BACs or cosmids comprising the physical map contained these genes. To directly identify candidate genes from the two cosmids, C1B and C13A, mouse brain cDNA libraries were screened by hybridization using cosmid DNA as probe. The library screening experiments were unsuccessful at identifying any candidate cDNAs from the region; therefore, an alternate strategy of shot-gun subcloning and sequencing of cosmids C1B and C20B was employed.

The cosmid sequences were edited and compiled to produce the complete genomic sequence from marker SLC14 to SLC20. The complete nonrecombinant mass1 interval was approximately 36 Kb. Analysis of the sequence by the exon-finding program, Genefinder, predicted one multiple-exon gene spanning the mass1 interval oriented from the distal to proximal end. Reverse transcription-PCR (RT-PCR) with primers spanning putative introns amplified products of the appropriate sizes from Frings and C57BL/6J total brain RNA. Sequence analysis of these bands confirmed that they matched the genomic sequence within the exons and identified the first intron-exon boundaries.

6.3 Cloning and Analysis of mass1 cDNA

RT-PCR experiments produced 1 Kb of open reading frame that could be amplified from mouse brain RNA. Subsequently, rapid amplification of cDNA ends (RACE) defined the 3′ end of the gene, which contained 330 base pairs of untranslated sequence from the first stop codon to the polyA tail. Multiple 5′ RACE reactions produced the complete cDNA sequence of mass1 and identified three putative alternate transcripts, each containing a unique 5′ untranslated sequence. When the cDNA sequence was aligned with 36 Kb of complete genomic sequence from cosmid C1B, 15 exons were noted to correspond to the 3′ end of the cDNA sequence; primers were designed from the remaining 5′ cDNA sequence and used to sequence cosmid C20B. Analysis of this genomic sequence revealed 20 exons as shown in FIG. 2. Thus the longest transcript is composed of 35 exons.

The mass1 gene encodes three putative alternate transcripts. The longest transcript is approximately 9.4 Kb, the second 7.1 Kb, and the shortest 3.7 Kb. Northern blot analyses of mouse RNA failed to produce conclusive data to confirm these transcript sizes and suggested that the transcript levels were very low. However, several autoradiograms with very long exposure times (3-4 weeks) suggested that the 9.4 and 7.1 Kb transcripts are expressed in mouse brain (data not shown). In situ hybridizations using a 3 Kb product from the 3′ end of the cDNA to probe mouse brain did not reveal any signal above background, further suggesting the mRNA levels to be very low.

Each putative transcript contains a unique 5′ untranslated region leading into the rest of the gene sequence. All three transcripts contain a possible splice variant in exon 27 where 83 base pairs of sequence are either included (27L) or removed (27S) from the transcript as illustrated in FIG. 3.

Referring to FIG. 4A, analysis of the expression of mass1 in mouse tissues by RT-PCR of brain, heart, kidney, liver, lung, muscle, intestine, and spleen RNA shows that the gene is predominantly found in the brain, lung, and kidney. Further analysis of the adult mouse brain showed ubiquitous mass1 expression throughout mouse brain regions including hippocampus, brain stem, cerebellum, midbrain and cortex as shown in FIG. 4B. Reverse transcription and PCR revealed mass1 transcripts to be present in RNA isolated from cultured astrocytes and in RNA aspirated and isolated from single mouse cultured cortical neurons as shown in FIG. 4C.

6.4 Mapping of the hMass1 Gene

A human genomic clone containing the human homolog of the mass1 gene was identified by screening a BAC library by PCR with primers from the mouse mass1 gene under lower stringency. This clone was used in fluorescent in situ hybridization experiments and mapped to human chromosome 5q14.

6.5 Identification of a mass1 Mutation in DNA from Frings Mice

Seventeen single nucleotide polymorphisms (SNPs) were identified between Frings and C57BL/6J mice within the nonrecombinant coding region, exons 21 to 35. One of these SNPs was a single base pair deletion detected in the Frings mouse mass1 gene by sequence analysis of PCR products. FIG. 5A shows the sequence chromatogram of this single G deletion at position 7009 in the Frings mouse DNA sample compared to the seizure-resistant control C57BL/6J. This deletion results in a frame shift of the open reading frame, changing the valine to a stop codon; this change is expected to produce a truncated MASS1 protein in Frings mice. Further analysis of the deletion in other mouse strains by gel electrophoresis showed that the deletion is only detected in Frings mouse DNA and not in any of the other seizure-resistant or seizure-susceptible mouse strains tested as shown in FIG. 5B. The deletion is located in exon 27 before the long and short splice variants. Of the other SNPs identified, six altered the amino acid sequence of the protein and could, theoretically, be the genetic basis of Frings audiogenic seizure-susceptibility. Otherwise, these changes represent polymorphisms that may produce subtle alterations in the function of the protein.
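
The effect of a single-base deletion on a reading frame can be sketched in a few lines of code. The example below uses an invented 18-nucleotide toy sequence, not the mass1 sequence, in which deleting one G converts a valine codon into a stop codon; the function and variable names are hypothetical, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cdna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(cdna: str, index: int) -> str:
    """Return the sequence with the base at 0-based `index` removed."""
    return cdna[:index] + cdna[index + 1:]


# Toy sequence (an assumption for illustration): ATG GCT GTG AAG CTT TAA
wild_type = "ATGGCTGTGAAGCTTTAA"
mutant = delete_base(wild_type, 6)  # remove the first G of the GTG (Val) codon

print(translate(wild_type))  # full-length toy peptide
print(translate(mutant))     # truncated: GTG A.. becomes TGA, a stop codon
```

In this toy case the deletion shifts the frame so that the former valine codon reads as TGA, which mirrors the kind of truncation described for the Frings allele.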

6.6 Analysis of the mass1 Translated Protein Sequence

The mass1 gene produces three putative transcripts: mass1.1 (9.4 Kb), mass1.2 (7.1 Kb), and mass1.3 (3.7 Kb). The long transcript contains 9327 nucleotides and is expected to produce an approximately 337 kilodalton (kD) protein. The medium transcript contains 6714 nucleotides and the predicted protein size is 244 kD. The short transcript open reading frame is 2865 nucleotides and the predicted protein size is approximately 103 kD. These transcripts and isoforms are based on incorporation of the longer splice form of exon 27 (27L). Further putative variants are possible as a result of the 27S alternate splicing event. Using the 27S exon theoretically shortens all the transcripts by 83 nucleotides and each of the isoforms by 645 amino acids (approximately 69.4 kD). The conceptual translation of the amino acid sequence for the mass1.1(27L) transcript is shown in FIG. 6. The MASS1 protein is strongly acidic and has a −192 charge at pH 7.0. The hydropathy plot indicated numerous hydrophobic domains that are candidates for transmembrane segments.
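
The relationship between open reading frame length and predicted protein size can be checked with a back-of-the-envelope calculation. The sketch below assumes an average residue mass of 110 Da (a common rule of thumb, not a value from this document) and assumes that each quoted nucleotide count includes the stop codon; the function name is hypothetical.

```python
AVERAGE_RESIDUE_DA = 110.0  # rule-of-thumb assumption, not from the source


def estimate_protein(orf_nucleotides: int) -> tuple[int, float]:
    """Return (residue count, approximate mass in kD) for an ORF length.

    Assumes the ORF length includes one terminal stop codon.
    """
    if orf_nucleotides < 6 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3 and >= 6")
    residues = orf_nucleotides // 3 - 1  # subtract the stop codon
    return residues, residues * AVERAGE_RESIDUE_DA / 1000.0


# ORF lengths quoted in the paragraph above for the 27L isoforms.
for name, orf in [("mass1.1", 9327), ("mass1.2", 6714), ("mass1.3", 2865)]:
    residues, kd = estimate_protein(orf)
    print(f"{name}: {residues} residues, ~{kd:.0f} kD")
```

With these assumptions the estimates land within a few percent of the 337, 244, and 103 kD values given above; a sequence-based calculation using actual residue masses would be needed for exact figures.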

Database searches using the mass1.1 sequence identified no expressed sequence tags (ESTs) that were identical and no homologous genes. However, a small repetitive motif from MASS1 shared homology with numerous Na⁺/Ca²⁺ exchangers. This homology was to the β1 and β2 repeats in the third cytosolic loop of the exchanger that contains the Ca²⁺ regulatory binding domain. Nicoll, D. A. et al. (1996), Ann NY Acad Sci 779: 86-92. Further analysis of MASS1 determined that this motif occurs 18 times within the sequence. Alignment of these sequences shows several highly conserved amino acids within this motif (FIG. 7) including a Proline-Glutamate-X-X-Glutamate (PEXXE) amino acid sequence (SEQ ID NO: 28) that is preceded by one to three acidic residues (D or E). The proline and first glutamate are completely conserved in all 18 related motifs, and the second glutamate is conserved in 16 of the motifs. In repeats 10 and 11, a lysine is substituted for the second glutamate. The PEXXE motif occurs twice more within the MASS1 sequence; however, these repeats (repeats 19 and 20) have a lower degree of identity and similarity (FIG. 6).
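
A motif of this shape can be located in a protein sequence with a simple pattern search. The following sketch looks for one to three acidic residues followed by P-E-X-X-E; the pattern, function name, and toy sequence are illustrative assumptions and do not reproduce the alignment method used for FIG. 7. It does not capture the lysine substitution seen in repeats 10 and 11.

```python
import re

# 1-3 acidic residues (D or E) immediately followed by Pro-Glu-X-X-Glu.
PEXXE_PATTERN = re.compile(r"([DE]{1,3})(PE..E)")


def find_pexxe(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the proline, matched text) for each hit."""
    return [(m.start(2) + 1, m.group(0))
            for m in PEXXE_PATTERN.finditer(protein.upper())]


# Invented toy sequence: one true hit, one PE..K near miss, and one PE..E
# that lacks the preceding acidic residue.
toy = "MKDDPEAAEGGSTEPELLKAAPEQQE"
for position, text in find_pexxe(toy):
    print(position, text)
```

Relaxing the final residue to `[EK]` would be one way to also pick up lysine-substituted repeats, at the cost of more false positives.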

Three aspartic acid residues (DDD) are found in the Na⁺/Ca²⁺ exchanger β1 segment and in the segment of the very large G-protein coupled receptor-1 directly preceding the PEXXE motif. In the MASS1 repeat, however, this DDD motif is not well conserved, with only repeat number 3 containing the exact DDD motif, and repeats 1, 9, and 18 containing conservative substitutions of glutamate residues. The 18 repeats are distributed across the MASS1 protein and repeats 14 to 18 would be missing from the truncated MASS1 protein (FIG. 6).

Analysis of the MASS1 sequence by Pattern Match identified a multicopper oxidase I consensus sequence site in the carboxyl-terminal region of MASS1. The multicopper oxidase I site is located in exon 29 (FIG. 6), within the region of the MASS1 protein that would be truncated by the Frings 7009ΔG mutation. Frings mice would therefore be lacking this potentially important domain. Biochemical analysis of this putative domain will determine if this is a functional multicopper oxidase I domain. Other less common motifs found within MASS1 include three tyrosine kinase phosphorylation motifs, two cAMP/cGMP-dependent phosphorylation motifs, and one glycosaminoglycan attachment motif. Finally, numerous common putative protein modification sites were identified including casein kinase II phosphorylation, protein kinase C phosphorylation, N-myristylation, and N-glycosylation sites. Further analysis of the MASS1 protein will be required to determine if any of these consensus sites are functional.

All patents, publications, and commercial materials cited herein are hereby incorporated by reference.

EXAMPLES

The following examples are given to illustrate various embodiments which have been made with the present invention. It is to be understood that the following examples are not comprehensive or exhaustive of the many types of embodiments which can be prepared in accordance with the present invention.

Example 1 Mouse Breeding, Seizure Testing and DNA Collection

Frings mice were crossed to the seizure-resistant strain C57BL/6J to produce F1 animals which, in turn, were intercrossed to generate 1200 F2 offspring. The Frings mice used in this study were bred in our colony and the C57BL/6J mice were supplied by the Jackson Laboratory (Bar Harbor, Me.). All mice were phenotyped at postnatal day 21 as seizure-susceptible or seizure-resistant as described previously. Skradski, S. L. et al., supra. Directly following seizure phenotyping, tail sections were cut for DNA preparation. Potential recombinant mice within the region were tested again to confirm the seizure phenotype, a second tail section was cut, and the mice were euthanized by CO₂ and bilateral thoracotomy. Spleens were harvested for DNA preparation by phenol/chloroform extraction and ethanol precipitation.

Example 2 Fine Mapping

All known MIT microsatellite markers between D13Mit200 and D13Mit126 were identified from the Chromosome 13 Committee map publicly available at the Mouse Genome Informatics Website. All F2 mice were initially tested with polymorphic markers D13Mit312, D13Mit97, and D13Mit69 to identify recombinant mice in the mass1 region, and the new recombinant mice were genotyped with additional markers, D13Mit9 and D13Mit190. Primer sequences and information for the markers were obtained from the Whitehead Institute Database site Genetic and Physical Maps of the Mouse Genome. Primer synthesis and SSLP analysis were performed as previously described. Skradski, S. L. et al., supra.

Example 3 Yeast Artificial Chromosomes

YAC maps spanning the region were obtained from the Genetic and Physical Maps of the Mouse Genome website. YACs which appeared to contain SSLP markers known to be within the region were obtained from Research Genetics and YAC DNA was prepared by standard techniques. Haldi, M. L. et al. (1996), Mamm Genome 7: 767-9; Silverman, G. A. (1996), Methods in Molecular Biology, Vol. 54, D. Markie, ed., Humana Press Inc., Totowa, N.J., pp. 65-68. All STSs shown to be associated with each YAC clone from the map were synthesized and tested to confirm that the clones were correct and aligned with overlapping YAC clones. Standard PCR conditions for physical mapping analyses were 10 mM Tris-HCl, 50 mM NaCl, 1.5 mM MgCl₂, 30 μM dNTPs, 0.5 μM of forward and reverse primers, and 50 ng of DNA in a 25 μL reaction volume. PCR thermocycles were 94° C. for 2 minutes, followed by 35-40 cycles of 94° C. for 10 seconds, 54° C. for 30 seconds, and 72° C. for 30 seconds with a 5 minute final extension at 72° C.
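
The thermocycling program described above can be written down as a small data structure, which makes it easy to compare run lengths for the 35- and 40-cycle variants. This is only a sketch: the class and function names are hypothetical, and the total ignores instrument ramp times between temperatures, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temperature_c: int
    seconds: int


# Values taken from the PCR conditions in the paragraph above.
INITIAL_DENATURATION = Step(94, 120)
CYCLE = (Step(94, 10), Step(54, 30), Step(72, 30))
FINAL_EXTENSION = Step(72, 300)


def hold_time_seconds(cycles: int) -> int:
    """Total programmed hold time, excluding ramping between steps."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + cycles * per_cycle
            + FINAL_EXTENSION.seconds)


for n in (35, 40):
    total = hold_time_seconds(n)
    print(f"{n} cycles: {total} s (~{total / 60:.0f} min of hold time)")
```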

Example 4 Bacterial Artificial Chromosomes

BACs were identified and isolated from the PCR-based mouse BAC library available from Research Genetics using all known STSs and SSLPs found in the region on linkage and YAC maps. BAC DNA was prepared using purification columns by the recommended procedure (Magnum columns, Genome Systems, Inc.). BAC end sequence was obtained using T7 and SP6 primers. Individual BAC insert sizes were determined by complete digestion of the BAC DNA with NotI and separating the fragments on a 1.0% agarose gel in 0.5×TBE circulating buffer. The field inversion gel electrophoresis (FIGE) program was 180 volts forward, 120 volts reverse, 0.1 seconds initial switching time linearly ramped to 3.5 seconds switching time for 16 hours.

Example 5 Simple Sequence Length Polymorphism (SSLP) Identification

BAC DNA was partially digested with Sau3AI into fragments ranging from 1 to 3 Kb and subcloned into the BamHI site of pUC18 with the Ready-To-Go cloning kit (Amersham Pharmacia Biotech). New repeats were identified by plating the subclone library, lifting duplicate Hybond-N membranes (Amersham Pharmacia Biotech), and hybridizing with (CA)₂₀ and (AT)₂₀ oligonucleotides end-labeled with γ³²P-ATP. Hybridized membranes were exposed to autoradiographic film. Clones producing a positive signal were sequenced and primer pairs were designed to amplify new repeat sequences. New SSLP markers were tested with control and recombinant mice to finely map the interval.

Example 6 Cosmid Subcloning

BAC 290J21 was partially digested with Sau3AI into 30-40 Kb fragments which were subcloned into cosmids as per the instructions for the SuperCos 1 cosmid vector kit (Stratagene) and packaged with Gigapack III Gold Packaging Extract (Stratagene) using XL1-Blue MRF′ competent cells. Cosmids were then aligned by amplification with all STSs across the region. Cosmid sequencing was performed by standard techniques using 1200 ng of cosmid DNA and 3.2 pmole of gene-specific mass1 oligos ranging from 18 to 24 nucleotides in length.

Example 7 Identifying and Cloning the mass1 Gene

The mass1 cDNA was identified by reverse transcription-PCR (RT-PCR) using primers developed from sequence of exons predicted by Genefinder. Total RNA was prepared from whole mouse brain of C57BL/6J, Frings and F1 mice with Trizol reagent as per instructions (Molecular Research Center, Inc.). The standard reverse transcription reaction conditions were 1.0 μg RNA, 15 ng random hexamers, 1×First Strand Buffer, 10 mM DTT, 1 mM dNTPs, 40 U RNAse Inhibitor, and 200 U Superscript II reverse transcriptase (Gibco BRL). First strand cDNAs were amplified using Pfx DNA polymerase (Gibco BRL) and multiple reactions were sequenced for each. Since the entire gene was not contained within the genomic sequence that was generated, 5′- and 3′-RACE were used to identify the remaining cDNA sequences.

Example 8 Reverse Transcription-PCR

The RT reactions to determine tissue specificity of mass1 expression were performed as described in the previous section on samples from CF1 (Charles River, Wilmington, Mass.), C57BL/6J (The Jackson Laboratory, Bar Harbor, Me.), or Frings mouse tissues and cells. The tissue panel samples were isolated from a single C57BL/6J mouse. The neuronal cDNA was produced from the pooled cellular extracts of 4-6 CF1 mouse cultured cortical neurons, and the astrocyte cDNA from CF1 astrocyte culture RNA extracted with Trizol reagent (Molecular Research Center, Inc.). PCR conditions to amplify the cDNAs were 10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl₂, 30 μM dNTPs, 0.5 μM of forward and reverse primers, and 1 μL of the cDNA in a 25 μL reaction volume. PCR thermocycles were 94° C. for 2 minutes, followed by 25 (β-actin primers) or 40 (mass1 primers) cycles of 94° C. for 10 seconds, 54° C. for 30 seconds, and 72° C. for 30 seconds with a 5 minute final extension at 72° C. The mass1 primers spanned from exon 22 to exon 23; the forward was 5′ CAG AGG ATG GAT ACA GTA C 3′ (SEQ ID NO: 29) and the reverse was 5′ GTA ATC TCC TCC TTG AGT TG 3′ (SEQ ID NO: 30), and the expected product size was 487 base pairs. The β-actin primers also spanned an intron and were forward 5′ GCA GTG TGT TGG CAT AGA G 3′ (SEQ ID NO: 31) and reverse 5′ AGA TCC TGA CCG AGC GTG 3′ (SEQ ID NO: 32), and the expected product size was 327 base pairs. PCR products for each tissue were mixed and separated by gel electrophoresis on 2% agarose gels in 1×TAE buffer at 120V, and the bands visualized by staining with ethidium bromide using an ultraviolet (UV) light source.
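
An expected product size of this kind can be derived by locating the forward primer and the reverse complement of the reverse primer on a template. The sketch below does this for an invented toy template built around the two mass1 primers quoted above; the function names are hypothetical, and exact string matching is a simplification that ignores mismatches and multiple binding sites.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_size(template: str, forward: str, reverse: str) -> Optional[int]:
    """Size in bp of the product bounded by the two primers, or None.

    Uses exact matching of the first occurrence of each primer site.
    """
    start = template.find(forward)
    rev_site = template.find(reverse_complement(reverse))
    if start < 0 or rev_site < start:
        return None
    return rev_site + len(reverse) - start


# Primer sequences from the paragraph above, spaces removed.
FORWARD = "CAGAGGATGGATACAGTAC"
REVERSE = "GTAATCTCCTCCTTGAGTTG"

# Toy template (an assumption): flank + forward site + 10 bp + reverse site + flank.
toy_template = ("TTTT" + FORWARD + "ACGTACGTAC"
                + reverse_complement(REVERSE) + "GGGG")
print(amplicon_size(toy_template, FORWARD, REVERSE))
```

Applied to a spliced cDNA sequence rather than this toy template, the same calculation gives the size expected from mRNA, while a larger product would indicate amplification across an intron from genomic DNA.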

Example 9 Polymorphism and Mutation Identification

For single-strand conformation polymorphism (SSCP) analysis, the mouse DNA samples A/J, AKR/J, BALB/cJ, C57BL/6J, C3H/HeJ, CAST/EiJ, LP/J, NON/LtJ, NOD/LtJ, SPRET/EiJ, and DBA/2J were supplied by the Jackson Laboratory (Bar Harbor, Me.). The CF1 mice were supplied by Charles River (Wilmington, Mass.), and the seizure-susceptible EL, EP, and SAS mice were supplied by Dr. T. Seyfried (Boston College, Boston, Mass.). PCR reactions were identical to those conditions listed above except 0.3 μL of α³²P-dCTP was included in a 10 μL total reaction volume. A 30 μL aliquot of dilution buffer (0.1% SDS/10 mM EDTA in ddH₂O) was added to the PCR reactions. A 10 μL aliquot of the dilute PCR reaction was mixed with 10 μL of loading dye (bromophenol blue/xylene cyanol) and 2 μL samples were separated by electrophoresis on a 9% bis-acrylamide, 10% glycerol, nondenaturing gel at 20W for 14 hours at room temperature with a fan. The PCR forward primer sequence was 5′ TTT ATT GTA GAG GAA CCT GAG 3′ (SEQ ID NO: 33) and the reverse primer sequence was 5′ GCC AGT AGC AAA CTG TCC 3′ (SEQ ID NO: 34), and the expected product size was 126 base pairs. Exon 27 PCR products were sequenced to determine that the aberrant band was due to a single G deletion in the Frings mouse mass1 gene as shown for C57BL/6 and Frings mouse DNA.

Example 10 MASS1 Amino Acid Sequence Analysis

The amino acid sequence of MASS1 was deduced from the nucleotide sequence of the cloned mass1 cDNA by DNA Star. The amino acid sequence was compared to known proteins by BLAST sequence similarity searching available on the website of the National Center for Biotechnology Information of the National Institutes of Health. Identification of functional domains utilized PSORT II Prediction, Sequence Motif Search, Global and Domain Similarity Search, and Pattern Match.

Example 11 Identification and Mapping of a BAC Containing the hmass1 Gene

Human mass1 was detected by a relaxed RT-PCR. Several primer sets corresponding to different exons of mouse mass1 were used to amplify human fetal brain cDNA. PCR conditions were the same as in mouse amplifications with the exception of an annealing temperature of 47° C. These primers were used to identify a human genomic clone containing a part of the hMass1 gene (CITB human BAC library).

Human lymphoblast cultures were treated with 0.025 mg/ml colcemid at 37° C. for 1.5 hr. Colcemid-treated cultures were pelleted at 500×g at room temperature for 8 min. Pellets were then re-suspended with 0.075 M KCl, 3 ml per pellet, for 15 minutes at room temperature. Cells were then fixed in 3:1 MeOH:acetic acid and stored at 4° C. Human BACs were labeled with spectrum orange using a nick translation kit per the manufacturer's protocol (Vysis, Downers Grove, Ill.). Slides were prepared by dropping fixed cells onto glass slides and washing with excess fixative. The slides were then washed in acetic acid for 35 min at room temperature and dehydrated in 70%, 85%, and finally 100% EtOH (2 min each). Chromosomes were denatured in 70% formamide in 2×SSC at 74° C. for 5 minutes and slides were dehydrated again as above except in ice cold EtOH. Two μg of labeled probe was blocked with 2 μg of human Cot-1 DNA in Hybrisol VI (ONCOR, Gaithersburg, Md.). The probe mixture was denatured at 74° C. for 5 minutes and then pre-annealed at 37° C. for 15 min. Twelve μL of pre-annealed probe was applied per slide, a cover slip was added and edges were sealed with rubber cement. Slides were hybridized in a darkened, humidified chamber for 16 hr at 37° C. Hybridized slides were then washed in 0.4×SSC containing 0.1% Tween-20 at 74° C. for 2 min, followed by 1 min at room temperature in 2×SSC. Slides were allowed to dry in the dark at room temperature and were stained with DAPI (Vector labs, Burlingame, Calif.) for chromosome visualization.

Summary

In summary, a novel gene which is associated with the Frings phenotype in mice has been isolated and characterized. The gene is known as the Monogenic Audiogenic Seizure-susceptible gene or mass1. The product of the mass1 gene is designated MASS1. Nucleic acid molecules that encode for MASS1 have been identified and purified. The sequence of murine mass1 can be found at SEQ ID NO: 1, and the sequence of human mass1 can be found at SEQ ID NO: 3. Mammalian genes encoding a MASS1 protein are also provided. The invention also provides recombinant vectors comprising nucleic acid molecules that code for a MASS1 protein. These vectors can be plasmids. In certain embodiments, the vectors are prokaryotic or eukaryotic expression vectors. The nucleic acid coding for MASS1 can be linked to a heterologous promoter. The invention also relates to transgenic animals in which one or both alleles of the endogenous mass1 gene is mutated.

The invention may be embodied in other specific forms without departing from its essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.

33 1 9437 DNA Mus musculus 1 aatgaacatg gcattggtgg tgtggtatga gccaacagtattgaatattc tgcatgtgtc 60 aggaggaagg aagaactctt gataatatag tcacaaacctttgagacagc tctcctagct 120 ctatgaatag atggttctga cattgcaccc ccagagatgtccactgctgt atacatgtct 180 gcactcaatg cttcccttat ccttataccc tgtgtttcagccaccaccca cggtggcatg 240 tttcaaagct gaagttctcc ctgtttcact ttttttggttctgaaagtca ttaacagctg 300 tatgtcttat gtgaccttct gcctgatgcc gaggcaggtgtgcatgacaa gtggtcctag 360 ggagccggct tgccccgatg cttagcttat ttttgtgacctcctgggccc tgtgagcatt 420 ttaatctatc atcttttagc tgagtagcct tcaagttcaagattcctcag agcagatgct 480 ggtagggctg ggaaaacctg tttgatgcag gctttgtttttctttacact gcttttctac 540 attctcattt aaaaaaatca tctatagtat attggtgctaggaatacaca ctgtaagagt 600 acaatctgag ctgatgtgct gtggcattta gcgtttctagggcggtactt ttaccaagtc 660 ctccctctct ctgattgatc aatgcctgat tgtctctgctcttctcaata gccctcatca 720 atctcggtga ttgagccaag gagcagaaat gcatctgtacctcttactct catcagagaa 780 aaagggacct atggaatggt caccgtgact tttgatgtatcaggtggccc aaatccccct 840 gaggaagact tgaatccagt tagaggaaat atcaccttcccacctggcag agcaactgtg 900 atttacaacg tgacagttct tgatgatgag gtaccagaaaatgatgaact atttttgatt 960 caactgagaa gtgtagaagg aggagcagag attaatgcttctaggagctc ggttgaaatc 1020 attgtgaaga aaaatgatag tcctgtgaac ttcatgcagagtgtttacgt ggttcccgag 1080 gacgaccacg tactcactat tcccgtgctt cgtgggaaggatagtgatgg aaatctcatt 1140 ggatctgatg aaacccaagt gtcaatcaga tacaaagtaatgacttggga ttcaacagca 1200 catgcccagc aaaacgttga ctttattgat cttcagccggatactactct tgtctttccc 1260 ccttttgttc atgaatcaca cctgaaattt cagataatcgatgaccttat acccgagata 1320 gctgagtcat ttcacatcat gttactaaag aacaccttacagggagatgc tgtgctaatg 1380 ggcccttcta cagtacaggt caccattaag ccaaatgacaagccctatgg agttctttca 1440 ttcaatagta ttttgtttga aagaccagtt ataattgatgaagatacagc atccagttct 1500 agatttgaag aaattgcagt ggttagaaat ggtggcacacatgggaatgt ctctgtgagc 1560 tgggtgttga cacggaacag cagtgatccc tcaccagtgaccgcagacat cacccctgct 1620 tctgggactc tgcagttcgc acaagggcag atgctggcgccaatttctct agtggtcttt 1680 gacgatgatc ttccagaaga 
ggctgaagct tacttacttacaatcttgcc tcacaccata 1740 caaggaggcg ctgaagtgag cgagccagcg cagcttctgttctacattca ggacagcgat 1800 aatgtttatg gagaaatagc cttttttcct ggggaaagccagaagattga aagcagccct 1860 agtgagcgat ccttatccct gagtttggcg agacgtgggggaagtaaagg agacgtgagg 1920 gtgatttatt ctgcacttta tattcctgct ggagctatggaccccttgcg agcaaaagat 1980 ggcatcttaa atacatctag gagaagcagc ctccttttcccagaacagaa ccaacaagtt 2040 tctataaaat taccgataag gaatgatgca ttcctccagaatggggccca cttcctagtg 2100 cagttggaag ctgtggtgtt ggtgaacata ttccctccgattccaccagt aagtcccaga 2160 ttcggagaaa tcagaaatat ttcattactg gttaccccagccattgcaaa tggagaaatt 2220 ggctttctta gcaaccttcc aattattttg catgaacccaaagattcttc tgctgaggtg 2280 gtatctatcc ccttgcatcg agatggaact gatggccaggctaccgtgta ctggagtttg 2340 cggccctctg gctttaattc aaaagcagtg actttggatgacgcaggtcc ttttaatggc 2400 tctgttgtgt ttttatctgg acaaaacgaa acatcaatcaacattactgt caaaggcgat 2460 gacataccgg agttgaatga aactgtaacc ctttctctagatagggtgag cgtggacagt 2520 gacgtcctaa aatcaggcta tactagccga gacttgattattttggaaaa tgatgaccct 2580 ggaggcattt ttgaattttc ttatgattct agaggaccctatgttataaa agaaggagat 2640 gccgtggagc tccggattac tcggtccagg gggtcgcttgttaaacagtt cctccgcttt 2700 cacgtggaac ccagagagag caatgaattc tatggaaacatgggggtgct agaattcacc 2760 ccaggagaac gggaagtagt gatcaccctc ctcaccagactggatggcac accagagttg 2820 gacgagcact tctgggcgat cctcagcagc catggtgagagagagagcaa gctgggccgt 2880 gctacactcg tcaacataac gattctcaaa aacgactatcctcatgggat tatagaattt 2940 gtttccgatg gtttgagtgc atcgataaaa gagagcaaaggggaggatat ctatcatgct 3000 gtttatggtg taatacgaac tcgaggcaac tttggtgctgttaatgtatc atggatggtt 3060 agtccagact ttacgcaaga tgtatttcct gtgcaaggaactgtttgttt tggagaccaa 3120 gaatttttta aaaacatcac tgtctactcc cttgtagatgaaattccaga ggagatggaa 3180 gaattcacca ttatcctact taatgccact ggaggagctcaaacagggat caggacaact 3240 gcctccctga ggattctcag gaacgatgac cccgtttactttgcagagcc ttgtgttttg 3300 agggtccagg agggtgagac tgccaacttt acagttctcagaaatggatc tgttgacggg 3360 gcctgcactg tccagtatgc taccgtggat gggaaggcttcaggagaaga 
gggagacttc 3420 gctcctgtgg agaagggaga aactcttgtg tttgaagttggaagcagaga gcagagtata 3480 tctgtacatg tcaaggatga cggaatccca gaaacagatgagccttttta tatagtcctg 3540 ttcaactcaa caggtgacac agtggtttat gagtacggggtagctacagt cataattgaa 3600 gccaacgatg acccaaatgg tgttttctct ctggagcccatagacaaagc agtggaagaa 3660 ggaaagacaa atgcattttg gattttacgg caccgaggacacttcggcaa tgtttctgtg 3720 gcttggcagc tgttccagaa tgcttctctg cagcctggacaagagttcta tgaaacatca 3780 gggactgtta acttcacaga tggaaaagaa acaaaaccagtcattctccg tgctttccca 3840 gataggattc ctgaattcaa tgaattttat attctaaggcttgtaaatat ttcaggtcct 3900 ggaggtcaac tagcagaaac caactttcag gtgacagtcatgattccatt caatgacgat 3960 ccgtttggaa ttttcatctt agatccagag tgtctagagagagaagtagc tgaagatgtc 4020 ctctcagaag acgacatgtc ttacatcacc agcttcaccattttgagaca acagggtgtc 4080 tttggtgatg tacgggttgg ctgggaagtc ctgtccagagagtttactgc tggccttcca 4140 ccaatgatag actttatact gctaggaagt tttccaagcactgtgccttt gcaaccacat 4200 atgcgacgtc accacagtgg aacagacgtc ctgtacttcagtggactaga gggtgcattt 4260 gggactgttg atcccaagta ccaacccttc agaaataacacaattgccaa ctttacgttt 4320 tcagcttggg taatgcctaa tgccaacaca aatgggtttctcatagcaaa ggatgacagt 4380 catggtagca tctactatgg agtaaaaatc caaacaaatgaaacccacgt gaccctttcc 4440 cttcattata aaacttttgg atcaaatgtt acatatattgccaagagcac tgtcatgaaa 4500 tatttagagg aaggtgtttg gcttcatgtt ttaatcatcttagatgatgg cataattgaa 4560 ttctatctgg acggaaaggc aatgcccaga ggcataaagagtctgaaagg agaagctatt 4620 actgatggtc ctgggatcct gagaattgga gcagggatggatggtggtgc cagattcaca 4680 ggttggatgc aggatgtgag gacctatgag cgcaagctgactcccgagga gatttacgaa 4740 cttcatgctg tgcctgcaag gactgattta cacccgatttctgggtatct ggagttcaga 4800 caaggagaaa gtaacaagtc gttcattgtt gctgcaagagatgacagtga agaggaagga 4860 gaagaattat tccttcttaa gctggtctct gtggatggtggggctcagat ttctaaggaa 4920 aacactactg ctcggctaag aatacagaaa agtgacaatgccaatggcct gtttggcttc 4980 actggggctt gtataccaga gatgacagag gaggggtccactgtttcctg tgtggttgag 5040 cgaacgaggg gagctctggg ttacgtgcat gttttctacaccatctccca gatcgagtca 5100 gaaggcatca attacctcgt 
tgatgatttt gccaatgccagtggcactat caccttcttg 5160 ccttggcagc ggtctgaggt cctgaatctg tacgttcttgatgaggacat gcctgagcta 5220 aatgaatatt ttcgggtgac gttggtgtct gcagttccaggagatggaaa acttggttca 5280 actcccatca gtggtgccag catagatcct gagaaggaaaccacaggcat cactgtcaaa 5340 gctagtgacc atccttacgg cttgatgcag ttctccacagggttgcctcc tcagcctgaa 5400 gattcaatga gtctgcctgc tagcagtgtg ccacatatcacagtgcagga agaggatggc 5460 gaaatccgtt tactggtcat tcgtgcacaa gggctccttggtcgggtgac tgtaggattt 5520 agaacagtat ccctgacagc atttagtcca gaggactaccagagcactgc tggcacatta 5580 gaatttcaat caggagaaag atataaatat atatttgtcaacatcactga taattccatc 5640 cctgaactgg aaaaatcttt taaagttgag ttgttaaacttggatggagg agtgtctgac 5700 ctctttaggg ttgatggcag tgggagtgga gaagcggacacggatttctt ccttccacct 5760 gtcctcccgc atgccagtct aggagtggct tcccagattctggtgaccat tgctgcctct 5820 gaccatgctc atggggtgtt tgaattcagc cctgaatcactcttcgtcag tggaactgaa 5880 ccagaggatg gatacagtac tgtcgtgtta aatgttacacggactcgggg agccctgtct 5940 gcagtgactt tgcaatggaa ggtagactcg gacctggatggggatctcgc cattacatct 6000 ggcaacatca catttgagac tgggcagagg attgcttccatcactgtgga gatactgtca 6060 gatgaagagc cagagctaga caaggcactc accgtgtcgatcctcaacgt gtccagtggc 6120 tccttgggag ttcttacaaa tgccacattg acaattttggctagtgatga tccttatggg 6180 gtctttattt ttcctaacaa aactagacct ttgagtgttgaagaagcaac ccagaatgtc 6240 acattatcga taataaggtt gaaaggcctc atgggagaagttgcagtctc atatgcaacc 6300 atagatgata tggaaaagcc accgtatttc ccacctaatttagctagagc aactcaagga 6360 ggagattaca tatcagcatc tggattggct cttttcagagctaatcagac tgaggcaaca 6420 atcactattt caatcctaga tgatgctgaa ccagaacgctcagaatctgt gttcattgaa 6480 cttttcaatt cctctttagt agacaaagta cagaatcgcccaatcccaca ttctccacgc 6540 cttgggccta aggtggagac tgtggcccat ctcgttattgttgccaatga cgatgcattt 6600 ggaactgtgc agctgtctgc aacatctgtt catgtagcagaaaatcatgt tggacccatt 6660 atcaatgtga ctcgaactgg aggaacattt gcagatgtttctgttaagtt taaagctgtg 6720 ccaataactg cagcagcggg tgaggactat agtatagcatcttcagacgt ggtcttgctg 6780 gaaggggaaa ccactaaagc tgtgccaata tatatcattaacgacatcta 
ccctgagctg 6840 gaagaaacct ttcttgtgca gctactaaac gaaacaacaggtggagccac actggggcct 6900 ctgagagagg cagtcattac catagaggcg tctgatgacccctacggact gtttggtttt 6960 cagaatacta aatttattgt agaggaacct gagtttaactcagtgagggt aaacgtgcca 7020 ataattcgaa attctgggac actcggcaat gttactgttcaatgggttgc catcattaat 7080 ggacagtttg ctactggcga cctgcgagtt gtctcaggtaatgtgacctt tgcccctggg 7140 gaaaccattc aaaccttgtt gttagaggtc ctggctgacgacgttccgga gattgaagag 7200 gttgtccagg tgcaactagc tgctgcctct ggcggaggtacaattgggtt agatcgagtg 7260 gcaaatattg ttattcctgc caatgataac ccttacggttcagtagcctt tgttcagtcc 7320 gtttttcgtg tccaagagcc tctagagaga agttcctatgctaacataac tgtcaggaga 7380 agcggaggac actttggtcg cctgctgttg tgctatggtacttctgatat tgatgtagtg 7440 gctcgtgcag ttgaggaagg tgaagatgtg ttatcctactatgaatcacc gactcaaggg 7500 gtgcccgacc cactctggag aacttgggtg aacgtgtctgcagtggagga gacacagtat 7560 acctgtgcca ctttgtgtct caaagaacgt gcctgctcagcgttttcagt tgtcagtggt 7620 gccgagggcc ctcggtgctt ctggatgacg tcgtgggtcagcggaactgt gaacagctct 7680 gacttccaaa cctacaagaa gaacatgact agggtggcctctcttttcag tggccaggca 7740 gttgctggta gtgactacga gcctgtgaca agacagtgggccgtgatact ggaaggtgat 7800 gagtttgcaa atctcactgt ttctgtactt cctgacgatgctcccgagat ggatgaaagt 7860 ttcctaattt ctctccttga agttcacctt atgaacatctcagacagttt taaaaaccag 7920 ccaaccatag gacatccgaa tacttccgct gtggtcataggactgaatgg cgatgccttt 7980 ggagtattca ttatctacag tgttagtccc aatacctcggaagatggctt atgtgtggaa 8040 gtgcaggaac agccacaaac ttctgtggaa ctggttatctacaggacagg aggcagcctg 8100 gggcaggtca tggtcgaatg gcgcgttgtt ggtggaacggctactgaagg tttagatttt 8160 atgggtgctg gagacattct tacttttgca gaaggtgaaaccaaaaagat ggccatttta 8220 accattttgg atgattctga gccagaggac aatgaaagcatccttgtccg tctggtggcc 8280 acagagggcg gaagcagaat cctgcccagc tcagacaccgtgacagtcaa catcttggca 8340 aacgacaatg tggcaggaat tgtcagcttt cagacagcttccagatctgt cataggccac 8400 gaaggagaaa tgttgcagtt ccatgtggta agaacacccccaggtcgagg aaatgtcact 8460 gtcaactgga aagttgttgg acaaaatcta gaagtcaattttgctaactt tacgggccaa 8520 ctcttcttct ctgagggtac 
attgaataaa acaatatttgtacatttgtt ggatgacaat 8580 attcctgagg agaaagaagt ataccaggtt gttctgtatgatgtcaagac ccaaggagtg 8640 tcgccagcag gagttgctct acttgatgcc cagggatatgcagctgtact gacagtggaa 8700 gcaagcgatg agccacacgg tgttttaaac tttgctctctcctcaagatt tgttgtgctc 8760 caggaggcta atgtaacaat tcagctcttc gtcaacagagagttcggatc tctaggagcc 8820 atcaatgtca cgtatgctac tgttcctgga atagtaagtctgaaaaacaa cacagaaggc 8880 aacctagcag agccagagtc tgacttcatc cctgtggtgggctctctggt tttggaggaa 8940 ggggaaacaa cagcagctat cagtatcact gtcctcgaggatgatatacc agagctaaaa 9000 gaatatttct tggtgaattt aactcatgtt gatctcattatggctcctct gacttcatct 9060 cctcccagac taggtatggg gctctccttt atgaaccttttgactaactg tgagagtcag 9120 aggacttcat tgttttaatc agagtgagtt gttatgggaacgtaacaccg ccccttgttt 9180 tgtttgctaa tttcagccat gtgtgaggat gtgatgagcatttagacttg ttctagttag 9240 agactgtcat tgtaagcagt gtaaggcaat aattactctggtgcttttta aattttacaa 9300 ctatgttact gccagatatg caacctgcaa ggtggtattacttttttcaa atgtattttt 9360 ccttcatttt cttttaaaat gtaactagct atcttcataagtcaacagtt ttcttttaag 9420 tttaatattt attttgt 9437 2 2780 PRT Musmusculus 2 Met Val Thr Val Thr Phe Asp Val Ser Gly Gly Pro Asn Pro ProGlu 1 5 10 15 Glu Asp Leu Asn Pro Val Arg Gly Asn Ile Thr Phe Pro ProGly Arg 20 25 30 Ala Thr Val Ile Tyr Asn Val Thr Val Leu Asp Asp Glu ValPro Glu 35 40 45 Asn Asp Glu Leu Phe Leu Ile Gln Leu Arg Ser Val Glu GlyGly Ala 50 55 60 Glu Ile Asn Ala Ser Arg Ser Ser Val Glu Ile Ile Val LysLys Asn 65 70 75 80 Asp Ser Pro Val Asn Phe Met Gln Ser Val Tyr Val ValPro Glu Asp 85 90 95 Asp His Val Leu Thr Ile Pro Val Leu Arg Gly Lys AspSer Asp Gly 100 105 110 Asn Leu Ile Gly Ser Asp Glu Thr Gln Val Ser IleArg Tyr Lys Val 115 120 125 Met Thr Trp Asp Ser Thr Ala His Ala Gln GlnAsn Val Asp Phe Ile 130 135 140 Asp Leu Gln Pro Asp Thr Thr Leu Val PhePro Pro Phe Val His Glu 145 150 155 160 Ser His Leu Lys Phe Gln Ile IleAsp Asp Leu Ile Pro Glu Ile Ala 165 170 175 Glu Ser Phe His Ile Met LeuLeu Lys Asn Thr Leu Gln Gly Asp Ala 180 185 190 Val Leu Met Gly Pro SerThr Val 
Gln Val Thr Ile Lys Pro Asn Asp 195 200 205 Lys Pro Tyr Gly ValLeu Ser Phe Asn Ser Ile Leu Phe Glu Arg Pro 210 215 220 Val Ile Ile AspGlu Asp Thr Ala Ser Ser Ser Arg Phe Glu Glu Ile 225 230 235 240 Ala ValVal Arg Asn Gly Gly Thr His Gly Asn Val Ser Val Ser Trp 245 250 255 ValLeu Thr Arg Asn Ser Ser Asp Pro Ser Pro Val Thr Ala Asp Ile 260 265 270Thr Pro Ala Ser Gly Thr Leu Gln Phe Ala Gln Gly Gln Met Leu Ala 275 280285 Pro Ile Ser Leu Val Val Phe Asp Asp Asp Leu Pro Glu Glu Ala Glu 290295 300 Ala Tyr Leu Leu Thr Ile Leu Pro His Thr Ile Gln Gly Gly Ala Glu305 310 315 320 Val Ser Glu Pro Ala Gln Leu Leu Phe Tyr Ile Gln Asp SerAsp Asn 325 330 335 Val Tyr Gly Glu Ile Ala Phe Phe Pro Gly Glu Ser GlnLys Ile Glu 340 345 350 Ser Ser Pro Ser Glu Arg Ser Leu Ser Leu Ser LeuAla Arg Arg Gly 355 360 365 Gly Ser Lys Gly Asp Val Arg Val Ile Tyr SerAla Leu Tyr Ile Pro 370 375 380 Ala Gly Ala Met Asp Pro Leu Arg Ala LysAsp Gly Ile Leu Asn Thr 385 390 395 400 Ser Arg Arg Ser Ser Leu Leu PhePro Glu Gln Asn Gln Gln Val Ser 405 410 415 Ile Lys Leu Pro Ile Arg AsnAsp Ala Phe Leu Gln Asn Gly Ala His 420 425 430 Phe Leu Val Gln Leu GluAla Val Val Leu Val Asn Ile Phe Pro Pro 435 440 445 Ile Pro Pro Val SerPro Arg Phe Gly Glu Ile Arg Asn Ile Ser Leu 450 455 460 Leu Val Thr ProAla Ile Ala Asn Gly Glu Ile Gly Phe Leu Ser Asn 465 470 475 480 Leu ProIle Ile Leu His Glu Pro Lys Asp Ser Ser Ala Glu Val Val 485 490 495 SerIle Pro Leu His Arg Asp Gly Thr Asp Gly Gln Ala Thr Val Tyr 500 505 510Trp Ser Leu Arg Pro Ser Gly Phe Asn Ser Lys Ala Val Thr Leu Asp 515 520525 Asp Ala Gly Pro Phe Asn Gly Ser Val Val Phe Leu Ser Gly Gln Asn 530535 540 Glu Thr Ser Ile Asn Ile Thr Val Lys Gly Asp Asp Ile Pro Glu Leu545 550 555 560 Asn Glu Thr Val Thr Leu Ser Leu Asp Arg Val Ser Val AspSer Asp 565 570 575 Val Leu Lys Ser Gly Tyr Thr Ser Arg Asp Leu Ile IleLeu Glu Asn 580 585 590 Asp Asp Pro Gly Gly Ile Phe Glu Phe Ser Tyr AspSer Arg Gly Pro 595 600 605 Tyr Val Ile Lys Glu Gly Asp Ala Val Glu LeuArg Ile Thr Arg Ser 
610 615 620 Arg Gly Ser Leu Val Lys Gln Phe Leu ArgPhe His Val Glu Pro Arg 625 630 635 640 Glu Ser Asn Glu Phe Tyr Gly AsnMet Gly Val Leu Glu Phe Thr Pro 645 650 655 Gly Glu Arg Glu Val Val IleThr Leu Leu Thr Arg Leu Asp Gly Thr 660 665 670 Pro Glu Leu Asp Glu HisPhe Trp Ala Ile Leu Ser Ser His Gly Glu 675 680 685 Arg Glu Ser Lys LeuGly Arg Ala Thr Leu Val Asn Ile Thr Ile Leu 690 695 700 Lys Asn Asp TyrPro His Gly Ile Ile Glu Phe Val Ser Asp Gly Leu 705 710 715 720 Ser AlaSer Ile Lys Glu Ser Lys Gly Glu Asp Ile Tyr His Ala Val 725 730 735 TyrGly Val Ile Arg Thr Arg Gly Asn Phe Gly Ala Val Asn Val Ser 740 745 750Trp Met Val Ser Pro Asp Phe Thr Gln Asp Val Phe Pro Val Gln Gly 755 760765 Thr Val Cys Phe Gly Asp Gln Glu Phe Phe Lys Asn Ile Thr Val Tyr 770775 780 Ser Leu Val Asp Glu Ile Pro Glu Glu Met Glu Glu Phe Thr Ile Ile785 790 795 800 Leu Leu Asn Ala Thr Gly Gly Ala Gln Thr Gly Ile Arg ThrThr Ala 805 810 815 Ser Leu Arg Ile Leu Arg Asn Asp Asp Pro Val Tyr PheAla Glu Pro 820 825 830 Cys Val Leu Arg Val Gln Glu Gly Glu Thr Ala AsnPhe Thr Val Leu 835 840 845 Arg Asn Gly Ser Val Asp Gly Ala Cys Thr ValGln Tyr Ala Thr Val 850 855 860 Asp Gly Lys Ala Ser Gly Glu Glu Gly AspPhe Ala Pro Val Glu Lys 865 870 875 880 Gly Glu Thr Leu Val Phe Glu ValGly Ser Arg Glu Gln Ser Ile Ser 885 890 895 Val His Val Lys Asp Asp GlyIle Pro Glu Thr Asp Glu Pro Phe Tyr 900 905 910 Ile Val Leu Phe Asn SerThr Gly Asp Thr Val Val Tyr Glu Tyr Gly 915 920 925 Val Ala Thr Val IleIle Glu Ala Asn Asp Asp Pro Asn Gly Val Phe 930 935 940 Ser Leu Glu ProIle Asp Lys Ala Val Glu Glu Gly Lys Thr Asn Ala 945 950 955 960 Phe TrpIle Leu Arg His Arg Gly His Phe Gly Asn Val Ser Val Ala 965 970 975 TrpGln Leu Phe Gln Asn Ala Ser Leu Gln Pro Gly Gln Glu Phe Tyr 980 985 990Glu Thr Ser Gly Thr Val Asn Phe Thr Asp Gly Lys Glu Thr Lys Pro 995 10001005 Val Ile Leu Arg Ala Phe Pro Asp Arg Ile Pro Glu Phe Asn Glu 10101015 1020 Phe Tyr Ile Leu Arg Leu Val Asn Ile Ser Gly Pro Gly Gly Gln1025 1030 1035 Leu Ala Glu Thr Asn 
Phe Gln Val Thr Val Met Ile Pro PheAsn 1040 1045 1050 Asp Asp Pro Phe Gly Ile Phe Ile Leu Asp Pro Glu CysLeu Glu 1055 1060 1065 Arg Glu Val Ala Glu Asp Val Leu Ser Glu Asp AspMet Ser Tyr 1070 1075 1080 Ile Thr Ser Phe Thr Ile Leu Arg Gln Gln GlyVal Phe Gly Asp 1085 1090 1095 Val Arg Val Gly Trp Glu Val Leu Ser ArgGlu Phe Thr Ala Gly 1100 1105 1110 Leu Pro Pro Met Ile Asp Phe Ile LeuLeu Gly Ser Phe Pro Ser 1115 1120 1125 Thr Val Pro Leu Gln Pro His MetArg Arg His His Ser Gly Thr 1130 1135 1140 Asp Val Leu Tyr Phe Ser GlyLeu Glu Gly Ala Phe Gly Thr Val 1145 1150 1155 Asp Pro Lys Tyr Gln ProPhe Arg Asn Asn Thr Ile Ala Asn Phe 1160 1165 1170 Thr Phe Ser Ala TrpVal Met Pro Asn Ala Asn Thr Asn Gly Phe 1175 1180 1185 Leu Ile Ala LysAsp Asp Ser His Gly Ser Ile Tyr Tyr Gly Val 1190 1195 1200 Lys Ile GlnThr Asn Glu Thr His Val Thr Leu Ser Leu His Tyr 1205 1210 1215 Lys ThrPhe Gly Ser Asn Val Thr Tyr Ile Ala Lys Ser Thr Val 1220 1225 1230 MetLys Tyr Leu Glu Glu Gly Val Trp Leu His Val Leu Ile Ile 1235 1240 1245Leu Asp Asp Gly Ile Ile Glu Phe Tyr Leu Asp Gly Lys Ala Met 1250 12551260 Pro Arg Gly Ile Lys Ser Leu Lys Gly Glu Ala Ile Thr Asp Gly 12651270 1275 Pro Gly Ile Leu Arg Ile Gly Ala Gly Met Asp Gly Gly Ala Arg1280 1285 1290 Phe Thr Gly Trp Met Gln Asp Val Arg Thr Tyr Glu Arg LysLeu 1295 1300 1305 Thr Pro Glu Glu Ile Tyr Glu Leu His Ala Val Pro AlaArg Thr 1310 1315 1320 Asp Leu His Pro Ile Ser Gly Tyr Leu Glu Phe ArgGln Gly Glu 1325 1330 1335 Ser Asn Lys Ser Phe Ile Val Ala Ala Arg AspAsp Ser Glu Glu 1340 1345 1350 Glu Gly Glu Glu Leu Phe Leu Leu Lys LeuVal Ser Val Asp Gly 1355 1360 1365 Gly Ala Gln Ile Ser Lys Glu Asn ThrThr Ala Arg Leu Arg Ile 1370 1375 1380 Gln Lys Ser Asp Asn Ala Asn GlyLeu Phe Gly Phe Thr Gly Ala 1385 1390 1395 Cys Ile Pro Glu Met Thr GluGlu Gly Ser Thr Val Ser Cys Val 1400 1405 1410 Val Glu Arg Thr Arg GlyAla Leu Gly Tyr Val His Val Phe Tyr 1415 1420 1425 Thr Ile Ser Gln IleGlu Ser Glu Gly Ile Asn Tyr Leu Val Asp 1430 1435 1440 Asp Phe Ala AsnAla Ser 
Gly Thr Ile Thr Phe Leu Pro Trp Gln 1445 1450 1455 Arg Ser GluVal Leu Asn Leu Tyr Val Leu Asp Glu Asp Met Pro 1460 1465 1470 Glu LeuAsn Glu Tyr Phe Arg Val Thr Leu Val Ser Ala Val Pro 1475 1480 1485 GlyAsp Gly Lys Leu Gly Ser Thr Pro Ile Ser Gly Ala Ser Ile 1490 1495 1500Asp Pro Glu Lys Glu Thr Thr Gly Ile Thr Val Lys Ala Ser Asp 1505 15101515 His Pro Tyr Gly Leu Met Gln Phe Ser Thr Gly Leu Pro Pro Gln 15201525 1530 Pro Glu Asp Ser Met Ser Leu Pro Ala Ser Ser Val Pro His Ile1535 1540 1545 Thr Val Gln Glu Glu Asp Gly Glu Ile Arg Leu Leu Val IleArg 1550 1555 1560 Ala Gln Gly Leu Leu Gly Arg Val Thr Val Gly Phe ArgThr Val 1565 1570 1575 Ser Leu Thr Ala Phe Ser Pro Glu Asp Tyr Gln SerThr Ala Gly 1580 1585 1590 Thr Leu Glu Phe Gln Ser Gly Glu Arg Tyr LysTyr Ile Phe Val 1595 1600 1605 Asn Ile Thr Asp Asn Ser Ile Pro Glu LeuGlu Lys Ser Phe Lys 1610 1615 1620 Val Glu Leu Leu Asn Leu Asp Gly GlyVal Ser Asp Leu Phe Arg 1625 1630 1635 Val Asp Gly Ser Gly Ser Gly GluAla Asp Thr Asp Phe Phe Leu 1640 1645 1650 Pro Pro Val Leu Pro His AlaSer Leu Gly Val Ala Ser Gln Ile 1655 1660 1665 Leu Val Thr Ile Ala AlaSer Asp His Ala His Gly Val Phe Glu 1670 1675 1680 Phe Ser Pro Glu SerLeu Phe Val Ser Gly Thr Glu Pro Glu Asp 1685 1690 1695 Gly Tyr Ser ThrVal Val Leu Asn Val Thr Arg Thr Arg Gly Ala 1700 1705 1710 Leu Ser AlaVal Thr Leu Gln Trp Lys Val Asp Ser Asp Leu Asp 1715 1720 1725 Gly AspLeu Ala Ile Thr Ser Gly Asn Ile Thr Phe Glu Thr Gly 1730 1735 1740 GlnArg Ile Ala Ser Ile Thr Val Glu Ile Leu Ser Asp Glu Glu 1745 1750 1755Pro Glu Leu Asp Lys Ala Leu Thr Val Ser Ile Leu Asn Val Ser 1760 17651770 Ser Gly Ser Leu Gly Val Leu Thr Asn Ala Thr Leu Thr Ile Leu 17751780 1785 Ala Ser Asp Asp Pro Tyr Gly Val Phe Ile Phe Pro Asn Lys Thr1790 1795 1800 Arg Pro Leu Ser Val Glu Glu Ala Thr Gln Asn Val Thr LeuSer 1805 1810 1815 Ile Ile Arg Leu Lys Gly Leu Met Gly Glu Val Ala ValSer Tyr 1820 1825 1830 Ala Thr Ile Asp Asp Met Glu Lys Pro Pro Tyr PhePro Pro Asn 1835 1840 1845 Leu Ala Arg Ala Thr Gln 
Gly Gly Asp Tyr IleSer Ala Ser Gly 1850 1855 1860 Leu Ala Leu Phe Arg Ala Asn Gln Thr GluAla Thr Ile Thr Ile 1865 1870 1875 Ser Ile Leu Asp Asp Ala Glu Pro GluArg Ser Glu Ser Val Phe 1880 1885 1890 Ile Glu Leu Phe Asn Ser Ser LeuVal Asp Lys Val Gln Asn Arg 1895 1900 1905 Pro Ile Pro His Ser Pro ArgLeu Gly Pro Lys Val Glu Thr Val 1910 1915 1920 Ala His Leu Val Ile ValAla Asn Asp Asp Ala Phe Gly Thr Val 1925 1930 1935 Gln Leu Ser Ala ThrSer Val His Val Ala Glu Asn His Val Gly 1940 1945 1950 Pro Ile Ile AsnVal Thr Arg Thr Gly Gly Thr Phe Ala Asp Val 1955 1960 1965 Ser Val LysPhe Lys Ala Val Pro Ile Thr Ala Ala Ala Gly Glu 1970 1975 1980 Asp TyrSer Ile Ala Ser Ser Asp Val Val Leu Leu Glu Gly Glu 1985 1990 1995 ThrThr Lys Ala Val Pro Ile Tyr Ile Ile Asn Asp Ile Tyr Pro 2000 2005 2010Glu Leu Glu Glu Thr Phe Leu Val Gln Leu Leu Asn Glu Thr Thr 2015 20202025 Gly Gly Ala Thr Leu Gly Pro Leu Arg Glu Ala Val Ile Thr Ile 20302035 2040 Glu Ala Ser Asp Asp Pro Tyr Gly Leu Phe Gly Phe Gln Asn Thr2045 2050 2055 Lys Phe Ile Val Glu Glu Pro Glu Phe Asn Ser Val Arg ValAsn 2060 2065 2070 Val Pro Ile Ile Arg Asn Ser Gly Thr Leu Gly Asn ValThr Val 2075 2080 2085 Gln Trp Val Ala Ile Ile Asn Gly Gln Phe Ala ThrGly Asp Leu 2090 2095 2100 Arg Val Val Ser Gly Asn Val Thr Phe Ala ProGly Glu Thr Ile 2105 2110 2115 Gln Thr Leu Leu Leu Glu Val Leu Ala AspAsp Val Pro Glu Ile 2120 2125 2130 Glu Glu Val Val Gln Val Gln Leu AlaAla Ala Ser Gly Gly Gly 2135 2140 2145 Thr Ile Gly Leu Asp Arg Val AlaAsn Ile Val Ile Pro Ala Asn 2150 2155 2160 Asp Asn Pro Tyr Gly Ser ValAla Phe Val Gln Ser Val Phe Arg 2165 2170 2175 Val Gln Glu Pro Leu GluArg Ser Ser Tyr Ala Asn Ile Thr Val 2180 2185 2190 Arg Arg Ser Gly GlyHis Phe Gly Arg Leu Leu Leu Cys Tyr Gly 2195 2200 2205 Thr Ser Asp IleAsp Val Val Ala Arg Ala Val Glu Glu Gly Glu 2210 2215 2220 Asp Val LeuSer Tyr Tyr Glu Ser Pro Thr Gln Gly Val Pro Asp 2225 2230 2235 Pro LeuTrp Arg Thr Trp Val Asn Val Ser Ala Val Glu Glu Thr 2240 2245 2250 GlnTyr Thr Cys Ala Thr Leu 
Cys Leu Lys Glu Arg Ala Cys Ser 2255 2260 2265Ala Phe Ser Val Val Ser Gly Ala Glu Gly Pro Arg Cys Phe Trp 2270 22752280 Met Thr Ser Trp Val Ser Gly Thr Val Asn Ser Ser Asp Phe Gln 22852290 2295 Thr Tyr Lys Lys Asn Met Thr Arg Val Ala Ser Leu Phe Ser Gly2300 2305 2310 Gln Ala Val Ala Gly Ser Asp Tyr Glu Pro Val Thr Arg GlnTrp 2315 2320 2325 Ala Val Ile Leu Glu Gly Asp Glu Phe Ala Asn Leu ThrVal Ser 2330 2335 2340 Val Leu Pro Asp Asp Ala Pro Glu Met Asp Glu SerPhe Leu Ile 2345 2350 2355 Ser Leu Leu Glu Val His Leu Met Asn Ile SerAsp Ser Phe Lys 2360 2365 2370 Asn Gln Pro Thr Ile Gly His Pro Asn ThrSer Ala Val Val Ile 2375 2380 2385 Gly Leu Asn Gly Asp Ala Phe Gly ValPhe Ile Ile Tyr Ser Val 2390 2395 2400 Ser Pro Asn Thr Ser Glu Asp GlyLeu Cys Val Glu Val Gln Glu 2405 2410 2415 Gln Pro Gln Thr Ser Val GluLeu Val Ile Tyr Arg Thr Gly Gly 2420 2425 2430 Ser Leu Gly Gln Val MetVal Glu Trp Arg Val Val Gly Gly Thr 2435 2440 2445 Ala Thr Glu Gly LeuAsp Phe Met Gly Ala Gly Asp Ile Leu Thr 2450 2455 2460 Phe Ala Glu GlyGlu Thr Lys Lys Met Ala Ile Leu Thr Ile Leu 2465 2470 2475 Asp Asp SerGlu Pro Glu Asp Asn Glu Ser Ile Leu Val Arg Leu 2480 2485 2490 Val AlaThr Glu Gly Gly Ser Arg Ile Leu Pro Ser Ser Asp Thr 2495 2500 2505 ValThr Val Asn Ile Leu Ala Asn Asp Asn Val Ala Gly Ile Val 2510 2515 2520Ser Phe Gln Thr Ala Ser Arg Ser Val Ile Gly His Glu Gly Glu 2525 25302535 Met Leu Gln Phe His Val Val Arg Thr Pro Pro Gly Arg Gly Asn 25402545 2550 Val Thr Val Asn Trp Lys Val Val Gly Gln Asn Leu Glu Val Asn2555 2560 2565 Phe Ala Asn Phe Thr Gly Gln Leu Phe Phe Ser Glu Gly ThrLeu 2570 2575 2580 Asn Lys Thr Ile Phe Val His Leu Leu Asp Asp Asn IlePro Glu 2585 2590 2595 Glu Lys Glu Val Tyr Gln Val Val Leu Tyr Asp ValLys Thr Gln 2600 2605 2610 Gly Val Ser Pro Ala Gly Val Ala Leu Leu AspAla Gln Gly Tyr 2615 2620 2625 Ala Ala Val Leu Thr Val Glu Ala Ser AspGlu Pro His Gly Val 2630 2635 2640 Leu Asn Phe Ala Leu Ser Ser Arg PheVal Val Leu Gln Glu Ala 2645 2650 2655 Asn Val Thr Ile Gln Leu Phe 
ValAsn Arg Glu Phe Gly Ser Leu 2660 2665 2670 Gly Ala Ile Asn Val Thr TyrAla Thr Val Pro Gly Ile Val Ser 2675 2680 2685 Leu Lys Asn Asn Thr GluGly Asn Leu Ala Glu Pro Glu Ser Asp 2690 2695 2700 Phe Ile Pro Val ValGly Ser Leu Val Leu Glu Glu Gly Glu Thr 2705 2710 2715 Thr Ala Ala IleSer Ile Thr Val Leu Glu Asp Asp Ile Pro Glu 2720 2725 2730 Leu Lys GluTyr Phe Leu Val Asn Leu Thr His Val Asp Leu Ile 2735 2740 2745 Met AlaPro Leu Thr Ser Ser Pro Pro Arg Leu Gly Met Gly Leu 2750 2755 2760 SerPhe Met Asn Leu Leu Thr Asn Cys Glu Ser Gln Arg Thr Ser 2765 2770 2775Leu Phe 2780 3 9018 DNA Homo sapiens n (585)..(585) wherein n is a, g,c, or t. 3 ctactttatt agtaaatctt ctttcagctt tactcatcct atttgtgtttggagaaacag 60 aaataagatt tacttggaca aactgaattt gttgttaatg aaacaagtacaacagttatt 120 cgtcttatca ttgaaaggat aggagagcca gcaaatgtta ctgcaattgtatcgctgtat 180 ggagaggacg ctggtgactt ttttgacaca tatgctgcag cttttatacctgccggagaa 240 acaaacagaa cagtgtacat agcagtatgt gatgatgact taccagagcctgacgaaact 300 tttatttttc acttaacatt acagaaacct tcagcaaatg tgaagcttggatggccaagg 360 actgttactg tgacaatatt atcaaatgac aatgcatttg gaattatttcatttaatatg 420 cttccctcaa tcgcagtgag tgagcccaag ggcagaaatg agtctatgcctcttactctc 480 atcagggaaa agggaaccta tggaatggtc atggtgactt ttgaggtagagggtggccca 540 aatccccctg atgaagattt gagtccagtt aaaggaaata tcacntttccccctggcaga 600 gcaacagtaa tttataactt gacagtactc gatgacgagg taccagaaaatgatgaaata 660 tttttaattc aactgaaaag tgtagaagga ggagctgaga ttaacacctctaggaattcc 720 attgagatca tcattaagaa aaatgatagt cccgtgagat tccttcagagtatttatttg 780 gttcctgagg aagaccacat actcataatt ccagtagttc gtggaaaggacaacaatgga 840 aatctgattg gatctgatga atatgaggtt tcaatcagtt atgctgtcacaactgggaat 900 tccacagcac atgcccagca aaatctggac ttcattgatc ttcagccaaacacaactgtt 960 gtttttccac cttttattca tgaatctcac ttgaaatttc aaatagttgatgacaccata 1020 ccggagattg ctgaatcgtt tcacattatg ttactaaaag ataccttacagggagatgct 1080 gtgctaataa gcccttctgt tgtacaagtc accattaagc caaatgataaaccttatgga 1140 gtcctttcat tcaacagtgt tttgtttgaa 
aggacagtta taattgatgaagatagaata 1200 tcaagatatg aagaaatcac agtggttaga aatggaggaa cccatgggaatgtctctgcg 1260 aattgggtgt tgacacggaa cagcactgat ccctcaccag taacagcagatatcagaccg 1320 agctctggag ttctccattt tgcacaaggg cagatgttgg caacaattcctcttactgtg 1380 gttgatgatg atcttccaga agaggcagaa gcttatctac ttcaaattctgcctcataca 1440 atacgaggag gtgcagaagt gagcgagcca gcggagcttt tgttctacattcaggatagt 1500 gatgatgtct atggcctaat aacatttttt cctatggaaa accagaagattgaaagcagc 1560 ccaggtgaac gatacttatc cttgagtttt acaagactag gagggactaaaggagatgtg 1620 aggttgcttt attctgtact ttacattcct gctggagctg tggaccccttgcaagcaaaa 1680 gaaggcatct taaatatatc agggagaaat gacctcattt ttccagagcaaaaaactcaa 1740 gtcactacaa aattaccaat aagaaatgat gcattccttc aaaatggagctcactttcta 1800 gtacagttgg aaactgtgga gttgttaaac ataattcctc taatcccacccataagccct 1860 agatttgggg aaatctgcaa tatttcttta ctggttactc cagccattgcaaatggagaa 1920 attggctttc tcagcaatct tccaattatt ttgcatgaac tagaagattttgctgctgaa 1980 gtggtataca ttcccttaca tcgggatgga actgatggcc aggctactgtctactggagt 2040 ttgaagccct ctggctttaa ttcaaaagca gtgaccccgg atgatataggcccctttaat 2100 ggctctgttt tgtttttatc tgggcaaagt gacacaacaa tcaacattactatcaaaggt 2160 gatgacatac cggaaatgaa tgaaactgta acactttctc tagacagggttaacgtggaa 2220 aaccaagtgc tgaaatctgg atatactagc cgtgacctaa ttattttggaaaatgatgac 2280 cctgggggag tttttgaatt ttctcctgct tccagaggac cctatgttataaaagaagga 2340 gaatctgtag agctccacat catccgatca agggggtccc ttgttaagcagtttctacac 2400 taccgagtag agccaagaga tagcaatgaa ttctatggaa acacgggagtactagaattt 2460 aaacctggag aaagggagat agtgatcacc ttgctagcaa gattggatgggataccagag 2520 ttggatgaac actactgggt ggtcctcagc agccacggag aacgggaaagcaagttggga 2580 agtgccacca ttgtcaatat aacgattctg aaaaatgatg atcctcatggcattatagaa 2640 tttgtttctg atggtctaat tgtgatgata aatgaaagca aaggagatgctatctatagt 2700 gctgtttatg atgtagtaag aaatcgaggc aactttggtg atgttagtgtatcatgggtg 2760 gttagtccag actttacaca agatgtattt cctgtacaag ggactgttgtctttggagat 2820 caggaatttt caaaaaatat caccatttac tcccttccag atgagattccagaagaaatg 2880 
gaagaattta ccgttatcct actgaatggc actggaggag ctaaagtgggaaatagaaca 2940 actgcaactc tgaggattag aagaaatgat gaccccattt attttgcagaacctcgtgta 3000 gtgagggttc aggaaggtga gactgccaac tttacagttc tcagaaatggatctgttgat 3060 gtgacttgca tggtccagta tgctaccaag gatgggaagg ctactgcaagagagagagat 3120 ttcattcctg ttgaaaaagg agaaacgctc atttttgagg ttggaagtagacagcagagc 3180 atatccatat ttgttaatga agatggtatc ccggaaacag atgagcccttttatataatc 3240 ctcttgaatt caacaggtga tacagtagta tatcaatatg gagtagctacagtaataatt 3300 gaagctaatg atgacccaaa tggcattttt tctctggagc ccatagacaaagcagtggaa 3360 gaaggaaaga ctaatgcatt ttggattttg aggcaccgag gatactttggtagtgtttct 3420 gtatcttggc agctctttca gaatgattct gctttgcagc ctgggcaggagttctatgaa 3480 acttcaggaa ctgttaactt catggatgga gaagaagcaa aaccaatcattctccatgct 3540 tttccagata aaattcctga attcaatgaa ttttatttcc taaaacttgtaaacatttca 3600 ggtggatccc caggtcctgg gggccagcta gcagaaacca acctccaggtgacagtaatg 3660 gttccattca atgatgatcc ctttggagtt tttatcttgg atccagagtgtttagagaga 3720 gaagtggcag aagatgtcct gtctgaagat gatatgtctt atattaccaacttcaccatt 3780 ttgaggcagc agggtgtgtt tggtgatgta caactgggct gggaaatactgtccagtgag 3840 ttccctgctg gtttgccacc aatgatagat tttttactgg ttggaattttccccaccacc 3900 gtgcatttac aacagcacat gcggcgtcac cacagtggaa cggatgctttgtactttacc 3960 ggactagagg gtgcatttgg gactgttaat ccaaaatacc atccctccaggaataataca 4020 attgccaact ttacattctc agcttgggta atgcccaatg ccaatacgaatggattcatt 4080 atagcgaagg atgacggtaa tggaagcatc tactacgggg taaaaatacaaacaaacgaa 4140 tcccatgtga cactttccct tcattataaa accttgggtt ccaatgctacatacattgcc 4200 aagacaacag tcatgaaata tttagaagaa agtgtttggc ttcatctactaattatcctg 4260 gaggatggta taatcgaatt ctacctggat ggaaatgcaa tgcccaggggaatcaagagt 4320 ctgaaaggag aagccattac tgacggtcct gggatactga gaattggagcagggataaat 4380 ggcaatgaca gatttacagg tctgatgcag gatgtgaggt cctatgagcggaaactgacg 4440 cttgaagaaa tttatgaact tcatgccatg cccgcaaaaa gtgatttacacccaatttct 4500 ggatatctgg agttcagaca gggagaaact aacaaatcat tcattatttctgcaagagat 4560 gacaatgacg aggaaggaga agaattattc 
attcttaaac tagtttctgtatatggagga 4620 gctcgtattt cggaagaaaa tactgctgca agattaacaa tacaaaaaagtgacaatgca 4680 aatggcttgt ttggtttcac aggagcttgt ataccagaga ttgcagaggagggatcaacc 4740 atttcttgtg tggttgagag aaccagagga gctctggatt atgtgcatgttttttacacc 4800 atttcacaga ttgaaactga tggcattaat taccttgttg atgactttgctaatgccagt 4860 ggaactatta cattccttcc ttggcagaga tcagaggttc tgaatatatatgttcttgat 4920 gatgatattc ctgaacttaa tgagtatttc cgtgtgacat tggtttctgcaattcctgga 4980 gatgggaagc taggctcaac tcctaccagt ggtgcaagca tagatcctgaaaaggaaacg 5040 actgatatca ccatcaaagc tagtgatcat ccatatggct tgctgcagttctccacaggg 5100 ctgcctcctc agcctaagga cgcaatgacc ctgcctgcaa gcagcgttccacatatcact 5160 gtggaggagg aagatggaga aatcaggtta ttggtcatcc gtgcacagggacttctggga 5220 agggtgactg cggaatttag aacagtgtcc ttgacagcat tcagtcctgaggattaccag 5280 aatgttgctg gcacattaga atttcaacca ggagaaagat ataaatacattttcataaac 5340 atcactgata attctattcc tgaactggaa aaatctttta aagttgagttgttaaacttg 5400 gaaggaggag ctgaactctt tagggttgat ggaagtggta gtggtgatggggacatggaa 5460 ttcttccttc caactattca caaacgtgcc agtctaggag tggcttcccaaattctagtg 5520 acaattgcag cctctgacca cgctcatggc gtatttgaat ttagccctgagtcactcttt 5580 gtcagtggaa ctgaaccaga agatgggtat agcactgtta cattaaatgttataagacat 5640 catggaactc tgtctccagt gactttgcat tggaacatag actctgatcctgatggtgat 5700 ctcgccttca cctctggcaa catcacattt gagattgggc agacgagcgccaatatcact 5760 gtggagatat tgcctgacga agacccagaa ctggataagg cattctctgtgtcagtcctc 5820 agtgtttcca gtggttcttt gggagctcat attaatgcca cgttaacagttttggctagt 5880 gatgatccat atgggatatt catttttcct gagaaaaaca gacctgttaaagttgaggaa 5940 gcaacccaga acatcacact atcaataata aggttgaaag gcctcatgggaaaagtcctt 6000 gtctcatatg caacactaga tgctatggaa aaaccacctt attttccacctaatttagcg 6060 agagcaactc aaggaagaga ctatatacca gcttctggat ttgctctttttggagctaat 6120 cagagtgagg caacaatagc tatttcaatt ttggatgatg atgagccagaaaggtccgaa 6180 tctgtcttta tcgaactact caactctact ttagtagcga aagtacagagtcgttcaatt 6240 ccaaattctc cacgtcttgg gcctaaggta gaaactattg cgcaactaattatcattgcc 6300 
aatgatgatg catttggaac tcttcagctc tcagcaccaa ttgtccgagtggcagaaaat 6360 catgttggac ccattatcaa tgtgactaga acaggaggag catttgcagatgtctctgtg 6420 aagtttaaag ctgtgccaat aactgcaata gctggtgaag attatagtatagcttcatca 6480 ggtgtggtct tgctagaagg ggaaaccagt aaagccgtgc caatatatgtcattaatgat 6540 atctatcctg aactgggaga atcttttctt gggcaactga tgaatgaaacgacaggagga 6600 gccagactag gggctttaac agaggcagtc attattattg aggcctctgatgacccctat 6660 ggattatttg ggtttcaaat tactaaactt attgtagagg aacctgagtttaactcagtg 6720 aaggtaaacc tgccaataat tcgaaattct gggacactcg gcaatgttactgttcagtgg 6780 gttgccacca ttaatggaca gcttgctact ggcgacctgc gagttgtctcaggtaatgtg 6840 acctttgccc ctggggaaac cattcaaacc ttgttgttag aggtcctggctgacgacgtt 6900 ccggagattg aagaggttat ccaagtgcaa ctaactgatg cctctggtggaggtactatt 6960 gggttagatc gaattgcaaa tattattatt cctgccaatg atgatccttatggtacagta 7020 gcctttgctc aggtggttta tcgtgttcaa gagcctctgg agagaagttcctatgctaac 7080 ataactgtca ggcgaagcgg agggcacttt ggtcggctgt tgttgttctacagtacttcc 7140 gacattgatg tagtggctct ggcaatggag gaaggtcaag atttactgtcctactatgaa 7200 tctccaattc aaggggtgcc tgacccactt tggagaactt ggatgaatgtctctgccgtg 7260 ggggagcccc tgtatacctg tgccactttg tgccttaagg aacaagcttgctcagcgttt 7320 tcatttttca gtgcttctga gggtccccag cgtttctgga tgacatcatggatcagccca 7380 gctgtcagca attcagactt ctggacctac aggaaaaaca tgaccagggtagcatctctt 7440 tttagtggtc aggctgtggc tgggagtgac tatgagcctg tgacaaggcaatgggccata 7500 atgcaggaag gtgatgaatt cgcaaatctc acagtgtcta ttcttcctgatgatttccca 7560 gagatggatg agagttttct aatttctctc cttgaagttc acctcatgaacatttcagcc 7620 agtttgaaaa atcagccaac cataggacag ccaaatattt ctacagttgtcatagcacta 7680 aatggtgatg cctttggagt gtttgtgatc tacagtatta gtcccaatacttccgaagat 7740 ggcttatttg ttgaagttca ggagcagccc caaaccttgg tggagctgatgatacacagg 7800 acagggggca gcttaggtca agtggcagtc gaatggcgtg ttgttggtggaacagctact 7860 gaaggtttag attttatagg tgctggagag attctgacct ttgctgaaggtgaaaccaaa 7920 aagacagtca ttttaaccat cttggatgac tctgaaccag aggatgacgaaagtatcata 7980 gttagtttgg tgtacactga aggtggaagt 
agaattttgc caagctccgacactgttaga 8040 gtgaacattt tggccaatga caatgtggca ggaattgtta gctttcagacagcttccaga 8100 tctgtcatag gtcatgaagg agaaatttta caattccatg tgataagaactttccctggt 8160 cgaggaaatg ttactgttaa ctggaaaatt attgggcaaa atctagaactcaattttgct 8220 aactttagcg gacaactttt ctttcctgag gggtcgttga atacaacattgtttgtgcat 8280 ttgttggatg acaacattcc tgaggagaaa gaagtatacc aagtcattctgtatgatgtc 8340 aggacacaag gagttccacc agccggaatc gccctgcttg atactcaaggatatgccgct 8400 gtcctcacag tagaagccag tgatgaacca catggagttt taaattttgctctttcatca 8460 agatttgtgt tactacaaga ggctaacata acaattcagc ttttcatcaacagagaattt 8520 ggatctctcg gagctatcaa tgtcacatat accacggttc ctggaatgctgagtctgaag 8580 aaccaaacag taggaaacct agcagagcca gaagttgatt ttgtccctatcattggcttt 8640 ctgattttag aagaagggga aacagcagca gccatcaaca ttaccattcttgaggatgat 8700 gtaccagagc tagaagaata tttcctggtg aatttaactt acgttggacttaccatggct 8760 gcttcaactt catttcctcc cagactaggt atgaggggtt tcttgtttgtttctttttgc 8820 tcacttcaaa tgaaatgaag aaacttcatt tttgaatcag aagtgatcattgtgctgttt 8880 tgttaatctt agctatgtgt taaaatatga tgggctttta tatttatttttgatactctc 8940 atatattgca atttttacaa tgaacaatgt aaagacatta aaaattattgtgtgatgctc 9000 tttaaatttt acaactat 9018 4 2777 PRT Homo sapiens 4 MetVal Met Val Thr Phe Glu Val Glu Gly Gly Pro Asn Pro Pro Asp 1 5 10 15Glu Asp Leu Ser Pro Val Lys Gly Asn Ile Thr Phe Pro Pro Gly Arg 20 25 30Ala Thr Val Ile Tyr Asn Leu Thr Val Leu Asp Asp Glu Val Pro Glu 35 40 45Asn Asp Glu Ile Phe Leu Ile Gln Leu Lys Ser Val Glu Gly Gly Ala 50 55 60Glu Ile Asn Thr Ser Arg Asn Ser Ile Glu Ile Ile Ile Lys Lys Asn 65 70 7580 Asp Ser Pro Val Arg Phe Leu Gln Ser Ile Tyr Leu Val Pro Glu Glu 85 9095 Asp His Ile Leu Ile Ile Pro Val Val Arg Gly Lys Asp Asn Asn Gly 100105 110 Asn Leu Ile Gly Ser Asp Glu Tyr Glu Val Ser Ile Ser Tyr Ala Val115 120 125 Thr Thr Gly Asn Ser Thr Ala His Ala Gln Gln Asn Leu Asp PheIle 130 135 140 Asp Leu Gln Pro Asn Thr Thr Val Val Phe Pro Pro Phe IleHis Glu 145 150 155 160 Ser His Leu Lys Phe Gln Ile Val Asp Asp Thr IlePro 
Glu Ile Ala 165 170 175 Glu Ser Phe His Ile Met Leu Leu Lys Asp ThrLeu Gln Gly Asp Ala 180 185 190 Val Leu Ile Ser Pro Ser Val Val Gln ValThr Ile Lys Pro Asn Asp 195 200 205 Lys Pro Tyr Gly Val Leu Ser Phe AsnSer Val Leu Phe Glu Arg Thr 210 215 220 Val Ile Ile Asp Glu Asp Arg IleSer Arg Tyr Glu Glu Ile Thr Val 225 230 235 240 Val Arg Asn Gly Gly ThrHis Gly Asn Val Ser Ala Asn Trp Val Leu 245 250 255 Thr Arg Asn Ser ThrAsp Pro Ser Pro Val Thr Ala Asp Ile Arg Pro 260 265 270 Ser Ser Gly ValLeu His Phe Ala Gln Gly Gln Met Leu Ala Thr Ile 275 280 285 Pro Leu ThrVal Val Asp Asp Asp Leu Pro Glu Glu Ala Glu Ala Tyr 290 295 300 Leu LeuGln Ile Leu Pro His Thr Ile Arg Gly Gly Ala Glu Val Ser 305 310 315 320Glu Pro Ala Glu Leu Leu Phe Tyr Ile Gln Asp Ser Asp Asp Val Tyr 325 330335 Gly Leu Ile Thr Phe Phe Pro Met Glu Asn Gln Lys Ile Glu Ser Ser 340345 350 Pro Gly Glu Arg Tyr Leu Ser Leu Ser Phe Thr Arg Leu Gly Gly Thr355 360 365 Lys Gly Asp Val Arg Leu Leu Tyr Ser Val Leu Tyr Ile Pro AlaGly 370 375 380 Ala Val Asp Pro Leu Gln Ala Lys Glu Gly Ile Leu Asn IleSer Gly 385 390 395 400 Arg Asn Asp Leu Ile Phe Pro Glu Gln Lys Thr GlnVal Thr Thr Lys 405 410 415 Leu Pro Ile Arg Asn Asp Ala Phe Leu Gln AsnGly Ala His Phe Leu 420 425 430 Val Gln Leu Glu Thr Val Glu Leu Leu AsnIle Ile Pro Leu Ile Pro 435 440 445 Pro Ile Ser Pro Arg Phe Gly Glu IleCys Asn Ile Ser Leu Leu Val 450 455 460 Thr Pro Ala Ile Ala Asn Gly GluIle Gly Phe Leu Ser Asn Leu Pro 465 470 475 480 Ile Ile Leu His Glu LeuGlu Asp Phe Ala Ala Glu Val Val Tyr Ile 485 490 495 Pro Leu His Arg AspGly Thr Asp Gly Gln Ala Thr Val Tyr Trp Ser 500 505 510 Leu Lys Pro SerGly Phe Asn Ser Lys Ala Val Thr Pro Asp Asp Ile 515 520 525 Gly Pro PheAsn Gly Ser Val Leu Phe Leu Ser Gly Gln Ser Asp Thr 530 535 540 Thr IleAsn Ile Thr Ile Lys Gly Asp Asp Ile Pro Glu Met Asn Glu 545 550 555 560Thr Val Thr Leu Ser Leu Asp Arg Val Asn Val Glu Asn Gln Val Leu 565 570575 Lys Ser Gly Tyr Thr Ser Arg Asp Leu Ile Ile Leu Glu Asn Asp Asp 580585 590 Pro Gly 
Gly Val Phe Glu Phe Ser Pro Ala Ser Arg Gly Pro Tyr Val595 600 605 Ile Lys Glu Gly Glu Ser Val Glu Leu His Ile Ile Arg Ser ArgGly 610 615 620 Ser Leu Val Lys Gln Phe Leu His Tyr Arg Val Glu Pro ArgAsp Ser 625 630 635 640 Asn Glu Phe Tyr Gly Asn Thr Gly Val Leu Glu PheLys Pro Gly Glu 645 650 655 Arg Glu Ile Val Ile Thr Leu Leu Ala Arg LeuAsp Gly Ile Pro Glu 660 665 670 Leu Asp Glu His Tyr Trp Val Val Leu SerSer His Gly Glu Arg Glu 675 680 685 Ser Lys Leu Gly Ser Ala Thr Ile ValAsn Ile Thr Ile Leu Lys Asn 690 695 700 Asp Asp Pro His Gly Ile Ile GluPhe Val Ser Asp Gly Leu Ile Val 705 710 715 720 Met Ile Asn Glu Ser LysGly Asp Ala Ile Tyr Ser Ala Val Tyr Asp 725 730 735 Val Val Arg Asn ArgGly Asn Phe Gly Asp Val Ser Val Ser Trp Val 740 745 750 Val Ser Pro AspPhe Thr Gln Asp Val Phe Pro Val Gln Gly Thr Val 755 760 765 Val Phe GlyAsp Gln Glu Phe Ser Lys Asn Ile Thr Ile Tyr Ser Leu 770 775 780 Pro AspGlu Ile Pro Glu Glu Met Glu Glu Phe Thr Val Ile Leu Leu 785 790 795 800Asn Gly Thr Gly Gly Ala Lys Val Gly Asn Arg Thr Thr Ala Thr Leu 805 810815 Arg Ile Arg Arg Asn Asp Asp Pro Ile Tyr Phe Ala Glu Pro Arg Val 820825 830 Val Arg Val Gln Glu Gly Glu Thr Ala Asn Phe Thr Val Leu Arg Asn835 840 845 Gly Ser Val Asp Val Thr Cys Met Val Gln Tyr Ala Thr Lys AspGly 850 855 860 Lys Ala Thr Ala Arg Glu Arg Asp Phe Ile Pro Val Glu LysGly Glu 865 870 875 880 Thr Leu Ile Phe Glu Val Gly Ser Arg Gln Gln SerIle Ser Ile Phe 885 890 895 Val Asn Glu Asp Gly Ile Pro Glu Thr Asp GluPro Phe Tyr Ile Ile 900 905 910 Leu Leu Asn Ser Thr Gly Asp Thr Val ValTyr Gln Tyr Gly Val Ala 915 920 925 Thr Val Ile Ile Glu Ala Asn Asp AspPro Asn Gly Ile Phe Ser Leu 930 935 940 Glu Pro Ile Asp Lys Ala Val GluGlu Gly Lys Thr Asn Ala Phe Trp 945 950 955 960 Ile Leu Arg His Arg GlyTyr Phe Gly Ser Val Ser Val Ser Trp Gln 965 970 975 Leu Phe Gln Asn AspSer Ala Leu Gln Pro Gly Gln Glu Phe Tyr Glu 980 985 990 Thr Ser Gly ThrVal Asn Phe Met Asp Gly Glu Glu Ala Lys Pro Ile 995 1000 1005 Ile LeuHis Ala Phe Pro Asp Lys Ile 
Pro Glu Phe Asn Glu Phe 1010 1015 1020 TyrPhe Leu Lys Leu Val Asn Ile Ser Gly Gly Ser Pro Gly Pro 1025 1030 1035Gly Gly Gln Leu Ala Glu Thr Asn Leu Gln Val Thr Val Met Val 1040 10451050 Pro Phe Asn Asp Asp Pro Phe Gly Val Phe Ile Leu Asp Pro Glu 10551060 1065 Cys Leu Glu Arg Glu Val Ala Glu Asp Val Leu Ser Glu Asp Asp1070 1075 1080 Met Ser Tyr Ile Thr Asn Phe Thr Ile Leu Arg Gln Gln GlyVal 1085 1090 1095 Phe Gly Asp Val Gln Leu Gly Trp Glu Ile Leu Ser SerGlu Phe 1100 1105 1110 Pro Ala Gly Leu Pro Pro Met Ile Asp Phe Leu LeuVal Gly Ile 1115 1120 1125 Phe Pro Thr Thr Val His Leu Gln Gln His MetArg Arg His His 1130 1135 1140 Ser Gly Thr Asp Ala Leu Tyr Phe Thr GlyLeu Glu Gly Ala Phe 1145 1150 1155 Gly Thr Val Asn Pro Lys Tyr His ProSer Arg Asn Asn Thr Ile 1160 1165 1170 Ala Asn Phe Thr Phe Ser Ala TrpVal Met Pro Asn Ala Asn Thr 1175 1180 1185 Asn Gly Phe Ile Ile Ala LysAsp Asp Gly Asn Gly Ser Ile Tyr 1190 1195 1200 Tyr Gly Val Lys Ile GlnThr Asn Glu Ser His Val Thr Leu Ser 1205 1210 1215 Leu His Tyr Lys ThrLeu Gly Ser Asn Ala Thr Tyr Ile Ala Lys 1220 1225 1230 Thr Thr Val MetLys Tyr Leu Glu Glu Ser Val Trp Leu His Leu 1235 1240 1245 Leu Ile IleLeu Glu Asp Gly Ile Ile Glu Phe Tyr Leu Asp Gly 1250 1255 1260 Asn AlaMet Pro Arg Gly Ile Lys Ser Leu Lys Gly Glu Ala Ile 1265 1270 1275 ThrAsp Gly Pro Gly Ile Leu Arg Ile Gly Ala Gly Ile Asn Gly 1280 1285 1290Asn Asp Arg Phe Thr Gly Leu Met Gln Asp Val Arg Ser Tyr Glu 1295 13001305 Arg Lys Leu Thr Leu Glu Glu Ile Tyr Glu Leu His Ala Met Pro 13101315 1320 Ala Lys Ser Asp Leu His Pro Ile Ser Gly Tyr Leu Glu Phe Arg1325 1330 1335 Gln Gly Glu Thr Asn Lys Ser Phe Ile Ile Ser Ala Arg AspAsp 1340 1345 1350 Asn Asp Glu Glu Gly Glu Glu Leu Phe Ile Leu Lys LeuVal Ser 1355 1360 1365 Val Tyr Gly Gly Ala Arg Ile Ser Glu Glu Asn ThrAla Ala Arg 1370 1375 1380 Leu Thr Ile Gln Lys Ser Asp Asn Ala Asn GlyLeu Phe Gly Phe 1385 1390 1395 Thr Gly Ala Cys Ile Pro Glu Ile Ala GluGlu Gly Ser Thr Ile 1400 1405 1410 Ser Cys Val Val Glu Arg Thr Arg GlyAla 
Leu Asp Tyr Val His 1415 1420 1425 Val Phe Tyr Thr Ile Ser Gln IleGlu Thr Asp Gly Ile Asn Tyr 1430 1435 1440 Leu Val Asp Asp Phe Ala AsnAla Ser Gly Thr Ile Thr Phe Leu 1445 1450 1455 Pro Trp Gln Arg Ser GluVal Leu Asn Ile Tyr Val Leu Asp Asp 1460 1465 1470 Asp Ile Pro Glu LeuAsn Glu Tyr Phe Arg Val Thr Leu Val Ser 1475 1480 1485 Ala Ile Pro GlyAsp Gly Lys Leu Gly Ser Thr Pro Thr Ser Gly 1490 1495 1500 Ala Ser IleAsp Pro Glu Lys Glu Thr Thr Asp Ile Thr Ile Lys 1505 1510 1515 Ala SerAsp His Pro Tyr Gly Leu Leu Gln Phe Ser Thr Gly Leu 1520 1525 1530 ProPro Gln Pro Lys Asp Ala Met Thr Leu Pro Ala Ser Ser Val 1535 1540 1545Pro His Ile Thr Val Glu Glu Glu Asp Gly Glu Ile Arg Leu Leu 1550 15551560 Val Ile Arg Ala Gln Gly Leu Leu Gly Arg Val Thr Ala Glu Phe 15651570 1575 Arg Thr Val Ser Leu Thr Ala Phe Ser Pro Glu Asp Tyr Gln Asn1580 1585 1590 Val Ala Gly Thr Leu Glu Phe Gln Pro Gly Glu Arg Tyr LysTyr 1595 1600 1605 Ile Phe Ile Asn Ile Thr Asp Asn Ser Ile Pro Glu LeuGlu Lys 1610 1615 1620 Ser Phe Lys Val Glu Leu Leu Asn Leu Glu Gly GlyAla Glu Leu 1625 1630 1635 Phe Arg Val Asp Gly Ser Gly Ser Gly Asp GlyAsp Met Glu Phe 1640 1645 1650 Phe Leu Pro Thr Ile His Lys Arg Ala SerLeu Gly Val Ala Ser 1655 1660 1665 Gln Ile Leu Val Thr Ile Ala Ala SerAsp His Ala His Gly Val 1670 1675 1680 Phe Glu Phe Ser Pro Glu Ser LeuPhe Val Ser Gly Thr Glu Pro 1685 1690 1695 Glu Asp Gly Tyr Ser Thr ValThr Leu Asn Val Ile Arg His His 1700 1705 1710 Gly Thr Leu Ser Pro ValThr Leu His Trp Asn Ile Asp Ser Asp 1715 1720 1725 Pro Asp Gly Asp LeuAla Phe Thr Ser Gly Asn Ile Thr Phe Glu 1730 1735 1740 Ile Gly Gln ThrSer Ala Asn Ile Thr Val Glu Ile Leu Pro Asp 1745 1750 1755 Glu Asp ProGlu Leu Asp Lys Ala Phe Ser Val Ser Val Leu Ser 1760 1765 1770 Val SerSer Gly Ser Leu Gly Ala His Ile Asn Ala Thr Leu Thr 1775 1780 1785 ValLeu Ala Ser Asp Asp Pro Tyr Gly Ile Phe Ile Phe Pro Glu 1790 1795 1800Lys Asn Arg Pro Val Lys Val Glu Glu Ala Thr Gln Asn Ile Thr 1805 18101815 Leu Ser Ile Ile Arg Leu Lys Gly Leu Met 
Gly Lys Val Leu Val 18201825 1830 Ser Tyr Ala Thr Leu Asp Ala Met Glu Lys Pro Pro Tyr Phe Pro1835 1840 1845 Pro Asn Leu Ala Arg Ala Thr Gln Gly Arg Asp Tyr Ile ProAla 1850 1855 1860 Ser Gly Phe Ala Leu Phe Gly Ala Asn Gln Ser Glu AlaThr Ile 1865 1870 1875 Ala Ile Ser Ile Leu Asp Asp Asp Glu Pro Glu ArgSer Glu Ser 1880 1885 1890 Val Phe Ile Glu Leu Leu Asn Ser Thr Leu ValAla Lys Val Gln 1895 1900 1905 Ser Arg Ser Ile Pro Asn Ser Pro Arg LeuGly Pro Lys Val Glu 1910 1915 1920 Thr Ile Ala Gln Leu Ile Ile Ile AlaAsn Asp Asp Ala Phe Gly 1925 1930 1935 Thr Leu Gln Leu Ser Ala Pro IleVal Arg Val Ala Glu Asn His 1940 1945 1950 Val Gly Pro Ile Ile Asn ValThr Arg Thr Gly Gly Ala Phe Ala 1955 1960 1965 Asp Val Ser Val Lys PheLys Ala Val Pro Ile Thr Ala Ile Ala 1970 1975 1980 Gly Glu Asp Tyr SerIle Ala Ser Ser Gly Val Val Leu Leu Glu 1985 1990 1995 Gly Glu Thr SerLys Ala Val Pro Ile Tyr Val Ile Asn Asp Ile 2000 2005 2010 Tyr Pro GluLeu Gly Glu Ser Phe Leu Gly Gln Leu Met Asn Glu 2015 2020 2025 Thr ThrGly Gly Ala Arg Leu Gly Ala Leu Thr Glu Ala Val Ile 2030 2035 2040 IleIle Glu Ala Ser Asp Asp Pro Tyr Gly Leu Phe Gly Phe Gln 2045 2050 2055Ile Thr Lys Leu Ile Val Glu Glu Pro Glu Phe Asn Ser Val Lys 2060 20652070 Val Asn Leu Pro Ile Ile Arg Asn Ser Gly Thr Leu Gly Asn Val 20752080 2085 Thr Val Gln Trp Val Ala Thr Ile Asn Gly Gln Leu Ala Thr Gly2090 2095 2100 Asp Leu Arg Val Val Ser Gly Asn Val Thr Phe Ala Pro GlyGlu 2105 2110 2115 Thr Ile Gln Thr Leu Leu Leu Glu Val Leu Ala Asp AspVal Pro 2120 2125 2130 Glu Ile Glu Glu Val Ile Gln Val Gln Leu Thr AspAla Ser Gly 2135 2140 2145 Gly Gly Thr Ile Gly Leu Asp Arg Ile Ala AsnIle Ile Ile Pro 2150 2155 2160 Ala Asn Asp Asp Pro Tyr Gly Thr Val AlaPhe Ala Gln Val Val 2165 2170 2175 Tyr Arg Val Gln Glu Pro Leu Glu ArgSer Ser Tyr Ala Asn Ile 2180 2185 2190 Thr Val Arg Arg Ser Gly Gly HisPhe Gly Arg Leu Leu Leu Phe 2195 2200 2205 Tyr Ser Thr Ser Asp Ile AspVal Val Ala Leu Ala Met Glu Glu 2210 2215 2220 Gly Gln Asp Leu Leu SerTyr Tyr Glu Ser Pro 
Ile Gln Gly Val 2225 2230 2235 Pro Asp Pro Leu TrpArg Thr Trp Met Asn Val Ser Ala Val Gly 2240 2245 2250 Glu Pro Leu TyrThr Cys Ala Thr Leu Cys Leu Lys Glu Gln Ala 2255 2260 2265 Cys Ser AlaPhe Ser Phe Phe Ser Ala Ser Glu Gly Pro Gln Arg 2270 2275 2280 Phe TrpMet Thr Ser Trp Ile Ser Pro Ala Val Ser Asn Ser Asp 2285 2290 2295 PheTrp Thr Tyr Arg Lys Asn Met Thr Arg Val Ala Ser Leu Phe 2300 2305 2310Ser Gly Gln Ala Val Ala Gly Ser Asp Tyr Glu Pro Val Thr Arg 2315 23202325 Gln Trp Ala Ile Met Gln Glu Gly Asp Glu Phe Ala Asn Leu Thr 23302335 2340 Val Ser Ile Leu Pro Asp Asp Phe Pro Glu Met Asp Glu Ser Phe2345 2350 2355 Leu Ile Ser Leu Leu Glu Val His Leu Met Asn Ile Ser AlaSer 2360 2365 2370 Leu Lys Asn Gln Pro Thr Ile Gly Gln Pro Asn Ile SerThr Val 2375 2380 2385 Val Ile Ala Leu Asn Gly Asp Ala Phe Gly Val PheVal Ile Tyr 2390 2395 2400 Ser Ile Ser Pro Asn Thr Ser Glu Asp Gly LeuPhe Val Glu Val 2405 2410 2415 Gln Glu Gln Pro Gln Thr Leu Val Glu LeuMet Ile His Arg Thr 2420 2425 2430 Gly Gly Ser Leu Gly Gln Val Ala ValGlu Trp Arg Val Val Gly 2435 2440 2445 Gly Thr Ala Thr Glu Gly Leu AspPhe Ile Gly Ala Gly Glu Ile 2450 2455 2460 Leu Thr Phe Ala Glu Gly GluThr Lys Lys Thr Val Ile Leu Thr 2465 2470 2475 Ile Leu Asp Asp Ser GluPro Glu Asp Asp Glu Ser Ile Ile Val 2480 2485 2490 Ser Leu Val Tyr ThrGlu Gly Gly Ser Arg Ile Leu Pro Ser Ser 2495 2500 2505 Asp Thr Val ArgVal Asn Ile Leu Ala Asn Asp Asn Val Ala Gly 2510 2515 2520 Ile Val SerPhe Gln Thr Ala Ser Arg Ser Val Ile Gly His Glu 2525 2530 2535 Gly GluIle Leu Gln Phe His Val Ile Arg Thr Phe Pro Gly Arg 2540 2545 2550 GlyAsn Val Thr Val Asn Trp Lys Ile Ile Gly Gln Asn Leu Glu 2555 2560 2565Leu Asn Phe Ala Asn Phe Ser Gly Gln Leu Phe Phe Pro Glu Gly 2570 25752580 Ser Leu Asn Thr Thr Leu Phe Val His Leu Leu Asp Asp Asn Ile 25852590 2595 Pro Glu Glu Lys Glu Val Tyr Gln Val Ile Leu Tyr Asp Val Arg2600 2605 2610 Thr Gln Gly Val Pro Pro Ala Gly Ile Ala Leu Leu Asp ThrGln 2615 2620 2625 Gly Tyr Ala Ala Val Leu Thr Val Glu Ala Ser 
Asp GluPro His 2630 2635 2640 Gly Val Leu Asn Phe Ala Leu Ser Ser Arg Phe ValLeu Leu Gln 2645 2650 2655 Glu Ala Asn Ile Thr Ile Gln Leu Phe Ile AsnArg Glu Phe Gly 2660 2665 2670 Ser Leu Gly Ala Ile Asn Val Thr Tyr ThrThr Val Pro Gly Met 2675 2680 2685 Leu Ser Leu Lys Asn Gln Thr Val GlyAsn Leu Ala Glu Pro Glu 2690 2695 2700 Val Asp Phe Val Pro Ile Ile GlyPhe Leu Ile Leu Glu Glu Gly 2705 2710 2715 Glu Thr Ala Ala Ala Ile AsnIle Thr Ile Leu Glu Asp Asp Val 2720 2725 2730 Pro Glu Leu Glu Glu TyrPhe Leu Val Asn Leu Thr Tyr Val Gly 2735 2740 2745 Leu Thr Met Ala AlaSer Thr Ser Phe Pro Pro Arg Leu Gly Met 2750 2755 2760 Arg Gly Phe LeuPhe Val Ser Phe Cys Ser Leu Gln Met Lys 2765 2770 2775 5 35 PRT Musmusculus 5 Gly Asn Ile Thr Phe Pro Pro Gly Arg Ala Thr Val Ile Tyr AsnVal 1 5 10 15 Thr Val Leu Asp Asp Glu Val Pro Glu Asn Asp Glu Leu PheLeu Ile 20 25 30 Gln Leu Arg 35 6 35 PRT Mus musculus 6 Thr Thr Leu ValPhe Pro Pro Phe Val His Glu Ser His Leu Lys Phe 1 5 10 15 Gln Ile IleAsp Asp Leu Ile Pro Glu Ile Ala Glu Ser Phe His Ile 20 25 30 Met Leu Leu35 7 35 PRT Mus musculus 7 Gly Thr Leu Gln Phe Ala Gln Gly Gln Met LeuAla Pro Ile Ser Leu 1 5 10 15 Val Val Phe Asp Asp Asp Leu Pro Glu GluAla Glu Ala Tyr Leu Leu 20 25 30 Thr Ile Leu 35 8 35 PRT Mus musculus 8Gly Ser Val Val Phe Leu Ser Gly Gln Asn Glu Thr Ser Ile Asn Ile 1 5 1015 Thr Val Lys Gly Asp Asp Ile Pro Glu Leu Asn Glu Thr Val Thr Leu 20 2530 Ser Leu Asp 35 9 35 PRT Mus musculus 9 Gly Val Leu Glu Phe Thr ProGly Glu Arg Glu Val Val Ile Thr Leu 1 5 10 15 Leu Thr Arg Leu Asp GlyThr Pro Glu Leu Asp Glu His Phe Trp Ala 20 25 30 Ile Leu Ser 35 10 35PRT Mus musculus 10 Gly Thr Val Cys Phe Gly Asp Gln Glu Phe Phe Lys AsnIle Thr Val 1 5 10 15 Tyr Ser Leu Val Asp Glu Ile Pro Glu Glu Met GluGlu Phe Thr Ile 20 25 30 Ile Leu Leu 35 11 35 PRT Mus musculus 11 GluThr Leu Val Phe Glu Val Gly Ser Arg Glu Gln Ser Ile Ser Val 1 5 10 15His Val Lys Asp Asp Gly Ile Pro Glu Thr Asp Glu Pro Phe Tyr Ile 20 25 30Val Leu Phe 35 12 35 PRT Mus 
musculus 12 Gly Thr Val Asn Phe Thr Asp GlyGlu Glu Thr Lys Pro Val Ile Leu 1 5 10 15 Arg Ala Phe Pro Asp Arg IlePro Glu Phe Asn Glu Phe Tyr Ile Leu 20 25 30 Arg Leu Val 35 13 35 PRTMus musculus 13 Gly Thr Ile Thr Phe Leu Pro Trp Gln Arg Ser Glu Val LeuAsn Leu 1 5 10 15 Tyr Val Leu Asp Glu Asp Met Pro Glu Leu Asn Glu TyrPhe Arg Val 20 25 30 Thr Leu Val 35 14 35 PRT Mus musculus 14 Gly ThrLeu Glu Phe Gln Ser Gly Glu Arg Tyr Lys Tyr Ile Phe Val 1 5 10 15 AsnIle Thr Asp Asn Ser Ile Pro Glu Leu Glu Lys Ser Phe Lys Val 20 25 30 GluLeu Leu 35 15 35 PRT Mus musculus 15 Gly Asn Ile Thr Phe Glu Thr Gly GlnArg Ile Ala Ser Ile Thr Val 1 5 10 15 Glu Ile Leu Pro Asp Glu Glu ProGlu Leu Asp Lys Ala Leu Thr Val 20 25 30 Ser Ile Leu 35 16 35 PRT Musmusculus 16 Gly Leu Ala Leu Phe Arg Ala Asn Gln Thr Glu Ala Thr Ile ThrIle 1 5 10 15 Ser Ile Leu Asp Asp Ala Glu Pro Glu Arg Ser Glu Ser ValPhe Ile 20 25 30 Glu Leu Phe 35 17 35 PRT Mus musculus 17 Ser Asp ValVal Leu Leu Glu Gly Glu Thr Thr Lys Ala Val Pro Ile 1 5 10 15 Tyr IleIle Asn Asp Ile Tyr Pro Glu Leu Glu Glu Thr Phe Leu Val 20 25 30 Gln LeuLeu 35 18 35 PRT Mus musculus 18 Gly Asn Val Thr Phe Ala Pro Gly Glu ThrIle Gln Thr Leu Leu Leu 1 5 10 15 Glu Val Leu Ala Asp Asp Val Pro GluIle Glu Glu Val Val Gln Val 20 25 30 Gln Leu Ala 35 19 35 PRT Musmusculus 19 Gln Trp Ala Val Ile Leu Glu Gly Asp Glu Phe Ala Asn Leu ThrVal 1 5 10 15 Ser Val Leu Pro Asp Asp Ala Pro Glu Met Asp Glu Ser PheLeu Ile 20 25 30 Ser Leu Leu 35 20 35 PRT Mus musculus 20 Asp Ile LeuThr Phe Ala Glu Gly Glu Thr Lys Lys Met Ala Ile Leu 1 5 10 15 Thr IleLeu Asp Asp Ser Glu Pro Glu Asp Asn Glu Ser Ile Leu Val 20 25 30 Arg LeuVal 35 21 35 PRT Mus musculus 21 Gly Gln Leu Phe Phe Ser Glu Phe Thr LeuAsn Lys Thr Ile Phe Val 1 5 10 15 His Leu Leu Asp Asp Asn Ile Pro GluGlu Lys Glu Val Tyr Gln Val 20 25 30 Val Leu Tyr 35 22 35 PRT Musmusculus 22 Gly Ser Leu Val Leu Glu Glu Gly Glu Thr Thr Ala Ala Ile SerIle 1 5 10 15 Thr Val Leu Glu Asp Asp Ile Pro Glu Leu Lys Glu Tyr PheLeu 
Val 20 25 30 Asn Leu Thr 35
23 35 PRT Mus musculus 23 Gly Thr Leu Val Phe Leu Glu Gly Glu Thr Glu Ala Asn Ile Thr Val 1 5 10 15 Thr Val Leu Asp Asp Asp Ile Pro Glu Leu Asp Glu Ser Phe Leu Val 20 25 30 Val Leu Leu 35
24 35 PRT Mus musculus 24 Gly Thr Val Ile Phe Lys Pro Gly Glu Thr Gln Lys Glu Ile Arg Val 1 5 10 15 Gly Ile Ile Asp Asp Asp Ile Phe Glu Glu Asp Glu Asn Phe Leu Val 20 25 30 His Leu Ser 35
25 35 PRT Mus musculus 25 Leu Thr Leu Ile Phe Leu Asp Gly Glu Arg Glu Arg Lys Val Ser Val 1 5 10 15 Gln Ile Leu Asp Asp Asp Glu Pro Glu Gly Gln Glu Phe Phe Tyr Val 20 25 30 Phe Leu Thr 35
26 35 PRT Mus musculus 26 Gly Glu Pro Glu Phe Glx Asn Asp Glu Ile Val Lys Thr Ile Ser Val 1 5 10 15 Lys Val Ile Asp Asp Glu Glu Tyr Glu Lys Asn Lys Thr Phe Phe Ile 20 25 30 Glu Ile Gly 35
27 5 PRT Artificial consensus sequence 27 Pro Glu Xaa Xaa Glu 1 5
28 19 DNA Artificial synthetic oligonucleotide 28 cagaggatgg atacagtac 19
29 20 DNA Artificial synthetic oligonucleotide 29 gtaatctcct ccttgagttg 20
30 19 DNA Artificial synthetic oligonucleotide 30 gcagtgtgtt ggcatagag 19
31 18 DNA Artificial synthetic oligonucleotide 31 agatcctgac cgagcgtg 18
32 21 DNA Artificial synthetic oligonucleotide 32 tttattgtag aggaacctga g 21
33 18 DNA Artificial synthetic oligonucleotide 33 gccagtagca aactgtcc 18
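
The consensus sequence of SEQ ID NO: 27 (Pro Glu Xaa Xaa Glu) can be compared against the 35-residue repeat sequences listed above. The following is a minimal sketch of such a comparison, not a method described in this specification. It assumes that Xaa is treated as matching any residue and that matching is otherwise exact; the names find_motif, CONSENSUS, SEQ_23 and SEQ_24 are hypothetical and are introduced only for illustration. The residues are transcribed from SEQ ID NO: 23 and SEQ ID NO: 24.

```python
# Illustrative sketch only: names below are hypothetical, not part of the listing.
# Assumption: "Xaa" in the consensus is a wildcard matching any single residue.
CONSENSUS = ["Pro", "Glu", "Xaa", "Xaa", "Glu"]  # SEQ ID NO: 27


def find_motif(residues, motif=CONSENSUS):
    """Return 1-based start positions at which the motif matches exactly."""
    hits = []
    for start in range(len(residues) - len(motif) + 1):
        window = residues[start:start + len(motif)]
        if all(m == "Xaa" or m == r for m, r in zip(motif, window)):
            hits.append(start + 1)  # sequence listings number residues from 1
    return hits


# Residues of SEQ ID NO: 23 (three-letter codes, 35 residues)
SEQ_23 = (
    "Gly Thr Leu Val Phe Leu Glu Gly Glu Thr Glu Ala Asn Ile Thr Val "
    "Thr Val Leu Asp Asp Asp Ile Pro Glu Leu Asp Glu Ser Phe Leu Val "
    "Val Leu Leu"
).split()

# Residues of SEQ ID NO: 24 (three-letter codes, 35 residues)
SEQ_24 = (
    "Gly Thr Val Ile Phe Lys Pro Gly Glu Thr Gln Lys Glu Ile Arg Val "
    "Gly Ile Ile Asp Asp Asp Ile Phe Glu Glu Asp Glu Asn Phe Leu Val "
    "His Leu Ser"
).split()

print(find_motif(SEQ_23))  # [24]
print(find_motif(SEQ_24))  # []
```

Under these assumptions SEQ ID NO: 23 matches at residue 24, whereas SEQ ID NO: 24, which carries Phe rather than Pro at that position, is not reported by exact matching. A comparison that tolerates such substitutions would need a scoring scheme, which is outside the scope of this sketch.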

What is claimed is:
1. An isolated and purified nucleic acid, the nucleic acid comprising nucleotides which code for the amino acid sequence of SEQ ID NO: 4.
2. A recombinant vector comprising the nucleic acid molecule of claim 1.
3. The recombinant vector of claim 2, wherein the recombinant vector is a plasmid.
4. The recombinant vector of claim 2, wherein the recombinant vector is a prokaryotic or eukaryotic expression vector.
5. The recombinant vector of claim 2, wherein the nucleic acid molecule is operably linked to a heterologous promoter.
6. A host cell comprising the vector of claim 2.
7. The host cell of claim 6, wherein the host cell is a eukaryotic host cell.
8. The host cell of claim 6, wherein the host cell is a prokaryotic host cell.
9. An isolated and purified nucleic acid which codes for human monogenic audiogenic seizure-susceptible protein, the nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3.
10. An isolated and purified nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3 or a nucleotide sequence complementary to the nucleotide sequence of SEQ ID NO: 3.
11. A recombinant vector comprising the nucleic acid molecule of claim 10.
12. The recombinant vector of claim 11, wherein the recombinant vector is a plasmid.
13. The recombinant vector of claim 11, wherein the recombinant vector is a prokaryotic or eukaryotic expression vector.
14. The recombinant vector of claim 11, wherein the nucleic acid molecule is operably linked to a heterologous promoter.
15. A host cell comprising the vector of claim 11.
16. The host cell of claim 15, wherein the host cell is a eukaryotic host cell.
17. The host cell of claim 15, wherein the host cell is a prokaryotic host cell.
18. An isolated and purified nucleic acid, the nucleic acid comprising nucleotides which code for the amino acid sequence of SEQ ID NO: 2.
19. A recombinant vector comprising the nucleic acid molecule of claim 18.
20. The recombinant vector of claim 19, wherein the recombinant vector is a plasmid.
21. The recombinant vector of claim 19, wherein the recombinant vector is a prokaryotic or eukaryotic expression vector.
22. The recombinant vector of claim 19, wherein the nucleic acid molecule is operably linked to a heterologous promoter.
23. A host cell comprising the vector of claim 19.
24. The host cell of claim 23, wherein the host cell is a eukaryotic host cell.
25. The host cell of claim 23, wherein the host cell is a prokaryotic host cell.
26. An isolated and purified nucleic acid which codes for murine monogenic audiogenic seizure-susceptible protein, the nucleic acid comprising the nucleotide sequence of SEQ ID NO: 1.
27. An isolated and purified nucleic acid comprising the nucleotide sequence of SEQ ID NO: 1 or a nucleotide sequence complementary to the nucleotide sequence of SEQ ID NO: 1.
28. A recombinant vector comprising the nucleic acid molecule of claim 27.
29. The recombinant vector of claim 28, wherein the recombinant vector is a plasmid.
30. The recombinant vector of claim 28, wherein the recombinant vector is a prokaryotic or eukaryotic expression vector.
31. The recombinant vector of claim 28, wherein the nucleic acid molecule is operably linked to a heterologous promoter.
32. A host cell comprising the vector of claim 28.
33. The host cell of claim 32, wherein the host cell is a eukaryotic host cell.
34. The host cell of claim 32, wherein the host cell is a prokaryotic host cell.